1. Introduction


The processing of geographic data has reached a high level of automation.
Automation has affected data gathering, data storage, data combination, and
map compilation. Airborne and spaceborne sensors, such as those on Landsat, are collecting
image data at an incredible rate. In order to process, combine, and exchange
data, institutions in the field of geographic data processing are archiving
image data and symbolic abstractions of it in digital, computer-usable form.

Computer-based systems have been built to provide rapid map
revision. Altering a map encoded in digital electronic storage requires only
that the small number of changed features be altered in representation.
Automatic drafting is readily available for producing standard map sheets from
computer-stored data.

There is one great bottleneck in automated cartography: the symbolic
features which are to represent raw input imagery must be extracted by human
operators. Generally this is a tedious and time-consuming process of identifying
a feature to the computer and then manually tracing its extent on the source
material mounted on a digitizing table. The remaining problem is therefore that
of automatic extraction of symbolic cartographic features from source imagery.

The payoff from solving this problem would be very large. There is a rapidly
increasing requirement for timely image analysis in order to manage earth resources,
to monitor pollution, to understand physical phenomena, to record and tax land
use, to plan and assess the impact of construction projects, to provide
reconnaissance in hostile areas, to manage agriculture, to monitor hydrology, etc.

While  the  promise  is  great  and  the  search  is  intense,  we  are  still  far  from 
a general  solution  to  the  feature  extraction  problem.  There  are  many  researchers 
in  the  field  doing  work  under  various  titles  such  as  pattern  recognition,  image 
or  picture  processing,  computer  vision,  robotics,  scene  analysis,  or  artificial 
intelligence (A.I.). It is now apparent that the problem must be approached by
taking many small steps since dramatic results have eluded the best of seekers
for 20 years. One of the goals of the research reported here was to identify
some  of  the  steps  to  be  tried  as  part  of  the  overall  solution  to  automation  of 
feature  extraction  for  cartographic  compilation. 

In  order  to  understand  the  current  state  of  automation  in  cartography  and 
the  position  of  feature  extraction  in  it,  a systems  oriented  approach  was  taken 
to  structure  the  study.  An  assessment  was  made  regarding  what  components  of  an 
automatic  system  are  currently  available  and  what  components  need  to  be  created. 
The hypothetical system desired is called ACES, for "Automatic Cartographic
Extraction System". Output products desired from the system are discussed first
in Section 2 along with current automation for producing them. Available inputs
and devices for producing them are considered in Section 3. Section 4 gives a
general discussion of components of ACES which must perform the transformation of
data. The knowledge component necessary for ACES, currently supplied by
humans, is studied in Section 5 and paradigms for applying knowledge in the
feature extraction process are developed in Section 6. Section 7 presents a
general overview of possible ACES process control. Outstanding problems
requiring work are discussed in Section 8 and future research directed toward
solution of some of these problems is outlined in Section 9.

The conclusions reached, discussed in Section 10, are summarized as follows.
The cartographic feature extraction problem is a difficult problem with difficult
subproblems. Certain of the subproblems, such as registration and use of a
geographic data base as a knowledge source, appear to have reasonable solutions. The
engineering of knowledge for use in cartographic feature extraction is the key
issue. No unified paradigm exists but several individual paradigms exist which
potentially solve some subproblems. Dependence on the use of knowledge makes the
ACES issue generic to other central issues of A.I. It is the most important issue
and there is much useful work to be done on it.

2. Automatic Cartographic Extraction System Output Objectives


Since  a system  is  originally  conceived  to  satisfy  some  specified  objective, 
it  is  best  to  begin  by  noting  the  desired  products  which  the  system  is  to  produce. 
There are several products which we could expect from ACES. First of all, we
would like to produce standard cartographic sheets and thematic maps and
overlays. Secondly, we would like to create special nonstandard products of the
same general kind by quick compilation from the standard digital data base.
Thirdly, there is great interest in creating digital data bases to be used
by other automatic systems. Such systems include navigational systems and
environmental or traffic modeling systems.


Even the production of thematic maps requires some interpretation and
symbolization of data; thus it can be assumed that all ACES output is in digital form.
Map sheets may be created from digital ACES output products by D/A conversion
such as plotting. Because of the interpretation and symbolization ACES
must perform in processing data, it follows that ACES logic will all be digital
and that all analogue inputs must be converted before interpretation. The
general ACES objective is therefore to create digital data bases. The exact
format of the data bases is left undefined in this report, but Sections 4, 5, and 6
discuss in some detail the types of information to be stored. Format will vary
according to the process that consumes the data. In particular, ACES will be
one of the larger users of its own product because it must use stored
knowledge to aid in interpretation of new input imagery.

2.1 Cartographic details: resolution, scale, and positioning

Aerial  imagery  is  already  a primary  input  source  for  map 
making  as  well  as  for  land  use  classification.  Most  map 
compilation  techniques  have  used  human  interpretation  and  human 
mensuration  of  the  input  data.  The  level  of  automation  is, 
however,  steadily  increasing.  For  instance,  it  is  now  possible 
to  make  elevation  maps  semi-automatically  from  stereo  imagery  in  a 
production  shop.  Thematic  maps  of  ERTS  imagery  are  also 
commonly  produced.  Although  there  are  differences  in  data 
collection  techniques  between  airplane  and  satellite  surveys 
there  are  no  conceptual  differences  between  them  from  the 
standpoint  of  automatic  map  compilation. 

It  is  assumed  that  input  to  an  automated  cartographic 
system  is  a matrix  of  spectral  signatures  collected  from  some 
"rectangular"  window  of  the  earth's  surface.  The  problems  of 
restoration  and  geometric  transformation  necessary  to  produce 
the  input  matrix  are  not  addressed  here.  (There  is  evidence 
that  these  problems  have  acceptable  solutions  in  the  interesting 
cases.) For practical purposes an aerial photograph can be
regarded as a matrix of infinitely many spectral samples of
the earth's radiance. For digital processing, however, the
window is represented as a matrix of m rows and n columns of
pixels, each of which is obtained by integrating continuous
radiance samples over some grid cell or resolution element.
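
The integration of radiance over a grid cell can be illustrated with a minimal sketch. It assumes, purely for illustration, that the continuous radiance field is approximated by a finer grid of samples and that integration is a plain unweighted average over each cell; the function name `integrate_to_pixels` and the sample values are hypothetical and do not describe any particular sensor.

```python
def integrate_to_pixels(samples, cell):
    """Average a fine grid of radiance samples over cell x cell blocks.

    `samples` is a list of equal-length rows; `cell` is the number of fine
    samples along each side of one resolution element (an assumption of
    this sketch, not a property of a real sensor).
    """
    rows, cols = len(samples), len(samples[0])
    if rows % cell or cols % cell:
        raise ValueError("grid size must be a multiple of the cell size")
    pixels = []
    for r0 in range(0, rows, cell):
        out_row = []
        for c0 in range(0, cols, cell):
            # Collect every fine sample that falls inside this grid cell.
            block = [samples[r][c]
                     for r in range(r0, r0 + cell)
                     for c in range(c0, c0 + cell)]
            out_row.append(sum(block) / len(block))
        pixels.append(out_row)
    return pixels


# Hypothetical 4 x 4 field of radiance samples reduced to a 2 x 2 pixel matrix.
fine = [
    [1.0, 3.0, 10.0, 10.0],
    [5.0, 7.0, 10.0, 10.0],
    [0.0, 0.0, 2.0, 4.0],
    [0.0, 0.0, 6.0, 8.0],
]
coarse = integrate_to_pixels(fine, 2)
```

A real sensor weights the samples by its own response function; the unweighted mean is only the simplest stand-in.
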

It is not necessary to produce photographs for automatic mapping.
It is quite common for imagery to be converted to digital form
at the platform (plane or satellite) and transmitted to earth.
The ERTS-1 satellite sampled and transmitted spectral signatures
of 60 meter (m) by 80 m units of the earth's surface. Seven
million of these units created a digital image, or frame,
which represented a window of roughly 185 km by 185 km on the
earth. Various sizes of resolution elements are possible
depending upon the height of the imaging platform and the
sophistication of equipment. Weather satellites deliver resolution
elements 800 m on a side while Skylab cameras are capable of 12 m units.
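
The frame size quoted above can be checked with a few lines of arithmetic. The sketch below assumes a square 185 km window tiled exactly by rectangular resolution elements, ignoring overlap and scan geometry; the function name `elements_per_frame` is hypothetical.

```python
def elements_per_frame(frame_side_m, cell_width_m, cell_height_m):
    """Number of resolution elements needed to tile a square ground window."""
    return (frame_side_m ** 2) / (cell_width_m * cell_height_m)


FRAME_SIDE_M = 185_000  # roughly 185 km, as quoted for an ERTS frame

erts_elements = elements_per_frame(FRAME_SIDE_M, 60, 80)
fine_elements = elements_per_frame(FRAME_SIDE_M, 12, 12)

print(f"60 m x 80 m elements: about {erts_elements / 1e6:.1f} million")
print(f"12 m x 12 m elements over the same window: about {fine_elements / 1e6:.0f} million")
```

The first figure comes out near seven million, in agreement with the text; the second shows how quickly data volume grows as the resolution element shrinks.
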

Digitally produced maps should have a map resolution of
0.05 mm based on the experimental evidence that a human can
resolve 10 line pairs per millimeter. The experience of the
Defense Mapping Agency Topographic Command confirms this
estimate: a grid spacing of 0.004 in. = 0.10 mm was found to
be too rough for the eye while 0.002 in. = 0.05 mm was suitable.

Maps output by the GISTS (Graphic Improvement Software Test System)
at DMATC (see Burdette et al., 1973) can be viewed as very large matrices
of $2^{14}$ by $2^{14}$ resolution elements.

Accepting 0.05 mm = $5 \times 10^{-5}$ m as the required map
resolution yields a simple relationship between map scale and ground
resolution.

    $R_g = 5 \times 10^{-5}\, S_m$    (Eq. 2.1)

where $R_g$ is ground resolution in meters and $S_m$ is map scale, expressed
as the denominator of the scale ratio (e.g. 1,400,000 for a 1:1,400,000 map).

(See  Doyle,  1973,  for  a similar  development.)  Taking  ERTS 
resolution  to  be  70  m on  the  ground,  a map  scale  of  1:1,400,000 
is  appropriate  for  a 1 to  1 correspondence  between  ground 
resolution  elements  and  map  resolution  elements.  Deviation 
from this scale toward smaller scales (e.g. 1:5,000,000) implies
a waste of sampling and storage effort while deviation toward
larger scales (e.g. 1:25,000) implies errors in representation.
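
The trade-off just described can be worked numerically. The sketch below applies Eq. 2.1 with the scale taken as the denominator of the scale ratio; the function names are hypothetical and the 70 m figure is the ERTS value used in the text.

```python
MAP_RESOLUTION_M = 5e-5  # 0.05 mm map resolution, from the text


def ground_resolution_m(scale_denominator):
    """Eq. 2.1: ground size of one map resolution element at a given scale."""
    return MAP_RESOLUTION_M * scale_denominator


def matched_scale_denominator(ground_res_m):
    """Scale at which one ground element maps onto one map element."""
    return ground_res_m / MAP_RESOLUTION_M


def map_elements_per_ground_element(ground_res_m, scale_denominator):
    """Linear number of map elements covered by one ground element.

    Values below 1 mean several ground samples share one map element
    (wasted sampling); values above 1 mean one sample is stretched over
    many map elements (coarse representation).
    """
    return ground_res_m / ground_resolution_m(scale_denominator)


erts_scale = matched_scale_denominator(70.0)
for denominator in (5_000_000, 1_400_000, 25_000):
    ratio = map_elements_per_ground_element(70.0, denominator)
    print(f"1:{denominator:,}  map elements per ERTS element: {ratio:.2f}")
```

At the matched scale the ratio is 1; at 1:5,000,000 it falls well below 1, and at 1:25,000 a single 70 m sample must stand for dozens of map elements along each axis.
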

Each  point  on  the  earth's  surface  can  be  assigned  a 
position  with  respect  to  some  local  or  global  coordinate  system. 
U.S.  map  accuracy  standards  specify  that  the  standard  error  for 
point  positions  should  not  exceed  0.3  mm  on  the  map.  Thus  the 
point  positioning  accuracy  limit  for  a mapping  technique  is 
as  follows. 

    $\sigma_p = 3 \times 10^{-4}\, S_m$    (Eq. 2.2)

where $\sigma_p$ is the standard error in meters and $S_m$ is the map scale.
(See Doyle, 1973.)

Ground  control  points  can  be  positioned  from  low  altitude 
photography  to  within  1 m and  from  ERTS  imagery  50  m 
accuracy  can  be  achieved  (Van  Wie,  1977).  This  is  good  enough 
for  current  purposes  and  is  likely  to  be  further  improved 
with  the  future  placement  of  navigation  satellites. 
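
As a rough check of Eq. 2.2 against the control-point accuracies just quoted, the following sketch converts a positional error into the largest map scale it supports, and a map scale into its allowable error. The function names are hypothetical, and the scale is again taken as the denominator of the scale ratio.

```python
MAP_ERROR_LIMIT_M = 3e-4  # 0.3 mm standard error on the map, from the text


def max_position_error_m(scale_denominator):
    """Eq. 2.2: largest ground standard error allowed at a given map scale."""
    return MAP_ERROR_LIMIT_M * scale_denominator


def largest_scale_denominator(position_error_m):
    """Smallest scale denominator (largest scale) a given error supports."""
    return position_error_m / MAP_ERROR_LIMIT_M


# Positional accuracies quoted in the text for two kinds of source imagery.
for source, error_m in (("low altitude photography", 1.0), ("ERTS imagery", 50.0)):
    denominator = largest_scale_denominator(error_m)
    print(f"{source}: {error_m} m supports scales up to about 1:{denominator:,.0f}")

print(f"Allowed error at 1:25,000: {max_position_error_m(25_000):.1f} m")
```

This covers positional accuracy only; a map must also satisfy the resolution relationship of Eq. 2.1 at the chosen scale.
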

According to Doyle (1973), the type of photographic
equipment in the Skylab satellite is capable of 12 m ground
resolution and 10 m positional accuracy, which nearly
allows compilation of maps of scale 1:25,000 if equations
2.1 and 2.2 are checked. While this map scale is not large
enough  for  reconnaissance  or  tax  assessment  purposes  it  is 
sufficient  for  a large  number  of  applications.  With  1 m 
resolution  and  1 m positioning  nearly  all  types  of  maps  could 
be produced.

2.2 ACES output products


It was previously stated that there are three general classes of
output to be generated by ACES. The first of these classes contains
standard cartographic sheets and thematic maps. Section 2.1 concluded
that maps of scale 1:25,000 could currently be compiled using satellite
platforms and that scales as large as 1:2,500 were probably achievable
with military satellites. Once cartographic feature extraction is
accomplished with the symbolic results stored in computer-readable form,
many specialized products could rapidly be produced. Special
hill-shading techniques could be used, for instance, to dramatize relief
for pilots about to fly over a given area at a given time of day.
Thematic mapping, made more popular by the advent of Landsat, is a
convenient way of using the grid-cell method of land feature symbolization.
Lending themselves readily to automatic classification, thematic maps
indicate the land class of each ground resolution element without much
consideration of its relationship (local or global) to other elements.
The global feature structure of the mapped terrain is not symbolized
in the computer data base; it is left to be supplied by the
perceptual system of the human user of the map.

Once geographic data is compiled and archived, it will be possible
to rapidly produce special products by selection and combination. For
example, an engineer may need to know all parts of a region
where ground elevation is less than 1200 feet. A two-color thematic
map generated from an elevation matrix might satisfy this requirement.
As another example, a military commander might want to see a "contour
plot" of a trafficability function t(x,y) indicating the mechanical
properties of the soil in a given region. Such special products could
be generated from ACES archives as the need arises and would not
necessarily impose any considerations on the system design.
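
The engineer's request is simple enough to sketch directly. The example below assumes an elevation matrix stored as rows of values in feet and uses two single-character codes in place of map colors; the function name, the codes, and the sample elevations are all hypothetical.

```python
def two_color_map(elevation_ft, threshold_ft=1200.0, below="L", above="H"):
    """Classify each cell of an elevation matrix against one threshold."""
    return [[below if z < threshold_ft else above for z in row]
            for row in elevation_ft]


# Hypothetical 3 x 4 elevation matrix, in feet.
elevation = [
    [1100.0, 1150.0, 1210.0, 1300.0],
    [1080.0, 1199.0, 1200.0, 1250.0],
    [1000.0, 1050.0, 1100.0, 1190.0],
]

for row in two_color_map(elevation):
    print("".join(row))
```

No feature extraction is involved: the product is a per-cell selection from archived data, which is why such requests place little burden on the system design.
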

ACES archives could also provide digital data bases for other
automatic terrain analysis systems. For instance, elevation matrices
and drainage features compiled in ACES could be input to hydrological
modeling programs. All features might be necessary as input to a
trafficability program whose job it is to find the best path of travel
for a truck going from point A to point B. As yet another example, two
archived maps of the same area made for different dates could be
compared to produce a map of change. These uses of ACES data do not really
depend on automatic feature extraction and are already possible with
current geographic data base systems which depend on human feature
recognition.
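
A map of change can be sketched as a cell-by-cell comparison of two thematic maps. The example assumes both maps are already registered to the same grid and use the same class labels; the function name and the land classes shown are hypothetical.

```python
def change_map(earlier, later):
    """Return a grid holding (old, new) where classes differ, else None."""
    if len(earlier) != len(later) or any(
            len(a) != len(b) for a, b in zip(earlier, later)):
        raise ValueError("maps must be registered to the same grid")
    return [[(a, b) if a != b else None for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(earlier, later)]


# Hypothetical land-class grids for the same area at two dates.
map_date_1 = [["forest", "forest", "water"],
              ["forest", "field", "water"]]
map_date_2 = [["forest", "cleared", "water"],
              ["cleared", "field", "water"]]

changes = change_map(map_date_1, map_date_2)
changed_cells = [(r, c) for r, row in enumerate(changes)
                 for c, cell in enumerate(row) if cell is not None]
```

The comparison is only as good as the registration and the consistency of the two classifications; any disagreement between them shows up as apparent change.
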

2.3  ACES  performance:  cost  and  quality 


Past automation techniques have already shown that acceptable
quality can sometimes be maintained with possible cost reduction. Cost
can be measured in dollars, in time, or in the number of errors in the
product. Quality is currently dependent on human feature extraction:
automatic elevation contouring, for instance, does not produce results
as satisfying as those done by a cartographer. It will probably remain
true for some time that automatically compiled products will be
noticeably more coarse than those done by humans and that progressive
refinement will cost more and more to achieve. One answer to this
problem is to economically produce the coarse maps knowing that human
consumers have the perceptual capability to smooth the interpretation
when necessary. The thematic maps produced automatically by EROS from
LANDSAT imagery exemplify this philosophy. Achievement of good quality
mapping by automatic means regardless of cost is an important research
goal. If and when that goal is reached, another period of development
will be required to produce that quality at a cost competitive with
present manual methods.

2.4  Output  devices  and  techniques 

Various sophisticated devices are currently available to meet map
accuracy standards in production plotting or to allow graphic
interactive exploration of a geographic data base. Production output devices
include random vector plotters (drum or flatbed), raster plotters (drum
or flatbed), and electron beam recorders. Both raster and vector plotters
can output a 5x10 symbolic point map at 0.002 inch resolution in
less  than  two  hours.  The  choice  of  a particular  output  device  will 
depend  upon  further  reproduction  steps.  Currently  there  is  a trend 
toward  raster  plotters  and  away  from  established  vector  plotting. 

There are algorithms for converting between symbolic raster and
vector representations which present no problems in batch mode (i.e. in
production mode). However, for graphical presentation to an
interactive human, the conversion algorithms are time consuming. If the
geographic data base (GDB) uses vector representation, a sort (and
perhaps broadening) is required for raster presentation. If a raster
representation is stored, then a tracking (and perhaps thinning)
operation is needed. Accessing and converting the data in this manner
places  a bottleneck  in  the  system,  not  only  for  geographical  output 
but  also  for  delivery  to  data  processing  programs  which  operate  with 
a representation  other  than  that  used  in  the  GDB.  There  is  no  apparent 
solution  to  this  data  representation  problem  other  than  bearing  the 
cost  (in  time  and  processing)  of  conversion  because  neither  raster  nor 
vector  representation  is  sufficient  for  all  processing  and  it  seems 
futile  to  maintain  data  in  both  forms. 
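
The vector-to-raster direction can be sketched in a few lines. The example below is only an illustration of the "sort" step mentioned above: it uses Bresenham's line algorithm (a choice made here, not one named in this report) to find the cells under each segment of a polyline and then sorts them into scan-line order. Broadening of the line is omitted, and the function names are hypothetical.

```python
def rasterize_segment(start, end):
    """Cells (row, col) touched by a straight segment, via Bresenham's algorithm."""
    (r0, c0), (r1, c1) = start, end
    dc = abs(c1 - c0)
    dr = -abs(r1 - r0)
    step_c = 1 if c0 < c1 else -1
    step_r = 1 if r0 < r1 else -1
    err = dc + dr
    cells = []
    while True:
        cells.append((r0, c0))
        if (r0, c0) == (r1, c1):
            break
        e2 = 2 * err
        if e2 >= dr:      # step along the column axis
            err += dr
            c0 += step_c
        if e2 <= dc:      # step along the row axis
            err += dc
            r0 += step_r
    return cells


def vector_to_raster(polyline):
    """Convert a polyline of (row, col) vertices to cells in scan-line order."""
    cells = set()
    for start, end in zip(polyline, polyline[1:]):
        cells.update(rasterize_segment(start, end))
    # The sort is what turns feature order into raster (row-major) order.
    return sorted(cells)


raster_cells = vector_to_raster([(0, 0), (0, 3), (2, 3)])
```

For a whole map the sort runs over every cell of every feature, which is one reason the conversion is slow in interactive use.
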

Section  3 discusses  input  devices  which  often  double  as  output 
devices, so further discussion is deferred until that section. A
similar assessment is made for both input and output technology:
the current capabilities far exceed the current capability to
meaningfully interpret the data automatically.

3. System input requirements


In  order  to  derive  the  geographic  data  products  discussed  in  Section  2, 
ACES  must  be  provided  with  varied  input.  Several  forms  of  input  are  necessary, 
the most obvious of which is aerial imagery containing current sensor
information. Other inputs are base maps and names which are not apparent in imagery
and  knowledge  for  interpreting  image  data.  Knowledge  may  be  implemented  by  a 
combination  of  software  and  data  base  or  by  an  interactive  human  analyst. 


3.1  Aerial  imagery 

Up-to-date  geographic  information  is  easily  obtained  via  aerial  imagery 
from  a variety  of  sensors.  Black  and  white  stereo  photography  has  been  the  most 
common  form  in  the  past  but  other  sensors  such  as  infrared  and  radar  are  gaining 
in  use.  Multispectral  imagery,  where  a set  of  registered  images  from  different 
sensors  is  produced,  has  been  the  subject  of  intense  effort  since  1972  due  to  the 
LANDSAT program. Because automatic processes make limited use of knowledge
relative to human photogrammetrists, it appears that multispectral imagery will be
preferred for some time to come in automatic analysis tasks. Major problems in using
multispectral imagery are 1) registration of sensors, 2) different resolution of
sensors, 3) low resolution of some sensors, and 4) high volumes of data. Huge
amounts of B&W photography are gathered and analyzed for current information needs.
It is estimated that the Air Force alone obtains 10 million pictures each year.
Thus the tremendous momentum of current manual procedures will surely influence
future experiments in automatic methods.

3.1.1 Black and white stereo

Black and white (B&W) stereo photography is the most common input to map
compilation. Highly developed hardware/software systems exist for rapid
extraction of information. Equipment such as UNAMACE [UNAMACE 1968] exists for
nearly automatic extraction of elevation matrices from stereo pairs. Extraction
of features such as road or drainage networks or vegetation windows is manually
guided. While the human provides the pattern-recognition and processing control
capabilities, the machine handles rectification, coordinate transformation, and
digital storage [Dubuisson 1977]. Automatic extraction of elevation data
is successful because it depends only on low-level techniques which correlate
images taken under almost identical conditions and which are already approximately
registered according to flight control information. Even so, extraction of
elevations in uniformly textured regions, such as forests, can be difficult and error-prone.
It would be a great accomplishment if automatic tracking of lineal features in B&W
photography could be achieved. Knowledge more sophisticated than a stereo model
surely will be required. A proposal for using an existing map and elevation data
for automatic tracking of drainage is discussed in Section 9. In general, analysis
of B&W imagery requires the use of large neighborhoods and global knowledge:
appropriate for human analysis, problematical for automatic analysis.

3.1.2 Multisensor imagery


The  chief  characteristic  of  multisensor  imagery  is  that  for  each  small 
geographic  neighborhood  several  signals  are  available  which  when  combined 
give  strong  indication  of  the  material  being  viewed.  Knowledge  of  the  material 
present  at  geographic  position  (x,y)  allows  for  simpler  tracking  of  cartographic 
features than is possible with B&W imagery. Unfortunately, there have been few
experiments reported where attempts were made to automatically track lineal
features in multisensor imagery. This is due in large part to the
concentration of research on low resolution LANDSAT data where most cartographically
interesting lineal features are beyond recognition. Higher resolution
multisensor imagery is not readily available, certainly not to the academic
community.

Collecting multisensor data is not as simple as gathering black and white
imagery. Multispectral sensed data (MSS) is obtained by splitting a beam of light
reflected from earth element (x,y) and filtering the branching beams to get tonal
samples in each of the bands. Accurate registration is assured by beam splitting.
If different sensors or different platforms are used, registration becomes more
difficult. Radar and infrared sensors have lower resolution capabilities than
visible light sensors and this presents additional problems. A vector of tonal samples
can, however, be derived for each earth element (x,y) in an output grid by using
a transformation system as implemented in DTEDS [Jancaitis 1977] or DIRS [Van Wie, 1977].
To do this, separate coordinate transformations $T_i(x,y)$ must be obtained for
each band i of sensory input. Some interpretation of each image i is necessary
to arrive at $T_i$, such as specification of control points or recognition of
structural features. This presents no theoretical problem but can result in
a great deal of computation in practice.
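
One way a per-band transformation might be derived from control points is sketched below. It assumes, purely for illustration, that an affine transformation is adequate and that exactly three non-collinear control points are given; the report does not specify the form of the transformations, real systems generally use more points and richer models, and the function names are hypothetical.

```python
def fit_affine(image_pts, ground_pts):
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f from three control points.

    Returns the coefficients (a, b, c, d, e, f), solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = image_pts
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("control points are collinear")

    def solve(v1, v2, v3):
        # Solve [x y 1] * (p, q, r) = v for the three control points.
        p = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        q = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        r = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return p, q, r

    a, b, c = solve(*(gx for gx, _ in ground_pts))
    d, e, f = solve(*(gy for _, gy in ground_pts))
    return a, b, c, d, e, f


def apply_affine(coeffs, x, y):
    """Map image coordinates (x, y) into the output grid."""
    a, b, c, d, e, f = coeffs
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical control points for one band: image coordinates and the
# corresponding output-grid coordinates.
T_band = fit_affine([(0, 0), (1, 0), (0, 1)], [(10, 20), (12, 20), (10, 23)])
mapped = apply_affine(T_band, 2, 2)
```

One such fit is needed for each band i, and every output cell must then be resampled through it, which is where the computational cost noted above arises.
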

3.2 Cartographic name file


Names  of  geographic  features  are  abstractions  and  are  not  at  all  evident 
in  aerial  imagery.  Yet,  names  are  essential  for  recognition  and  communication 
in  geographic  displays.  Names  are  really  tags  for  concepts  which  include  both 
qualitative  and  quantitative  information.  For  instance  the  name  of  a city 
could  evoke  a latitude  and  longitude,  a population,  an  elevation,  an  average 
temperature  and  rainfall,  political  entities,  etc.  A name  often  links  to  other 
names;  for  instance,  Philadelphia  is  in  the  state  of  Pennsylvania,  located  on 
the  Delaware  River,  and  has  Rizzo  as  mayor.  There  appears  to  be  no  end  to  the 
complexity  which  name  concepts  can  create  and  no  unique  way  for  organizing  name 
data  for  storage  and  retrieval. 

For standard cartographic expression, file formats for name data can be
established. U.S.G.S. has developed the GYPSY system [Orth 1974] for storage
and retrieval of name data. Clearly only minimal information must be stored
with a name for use in creating standard map symbolization. For instance, for
small towns the geographic coordinates and population should suffice. However,
to provide for creation of specialized thematic products it may be important to
have information items such as amount of electricity consumed, type of water
system,  number  of  Exxon  service  stations,  or  percentage  of  the  population  under 
40.  Clearly  some  range  of  output  products  must  be  specified  before  a practical 
data  base  can  be  established. 

Names can be attached to any of the three basic geographic data types:
points (e.g. a mountain peak), lines (e.g. a river), and regions (e.g. a
desert). Names will be absolutely essential in symbolizing output displays.
The concept associated with the name is likely to be useful in interpretation
of imagery which might affect other symbolization on the map. For instance,
knowledge of the location of a large city might cause other interpretation
and symbolization to be suppressed within a specified region of the imagery.
Theoretically the opportunities for applying a priori knowledge stored in a
file of named geographic objects are unlimited; in practice, however, only very
constrained uses are likely to prove possible.
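
A name record of the kind discussed in this section might be organized as in the sketch below. This is a hypothetical layout, not the GYPSY format: the class and field names are invented for illustration, and the coordinates are approximate.

```python
from dataclasses import dataclass, field
from enum import Enum


class Geometry(Enum):
    """The three basic geographic data types a name can be attached to."""
    POINT = "point"
    LINE = "line"
    REGION = "region"


@dataclass
class NamedFeature:
    name: str
    geometry: Geometry
    coordinates: list                                # (latitude, longitude) pairs
    attributes: dict = field(default_factory=dict)   # quantitative items
    links: dict = field(default_factory=dict)        # relations to other names


def standard_symbol_fields(feature):
    """Minimal information needed to place a name on a standard map sheet."""
    return {"name": feature.name,
            "geometry": feature.geometry.value,
            "coordinates": feature.coordinates}


# Example record using the relations mentioned in the text.
philadelphia = NamedFeature(
    name="Philadelphia",
    geometry=Geometry.POINT,
    coordinates=[(39.95, -75.17)],  # approximate, for illustration only
    links={"state": "Pennsylvania", "river": "Delaware River", "mayor": "Rizzo"},
)
minimal = standard_symbol_fields(philadelphia)
```

Keeping open-ended attributes and links apart from the few fields needed for symbolization reflects the point made above: the minimal record is easy to fix, while the rest depends on which output products are specified.
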

3.3 Base maps and bootstrapping


Stored base maps will be an essential component of any ACES type system.
Base maps can be thought of as skeletons on which the flesh of current detail
must be hung. The source of the detail is aerial imagery and its content will
be specific to the display task. Base maps are available at some scale in
digital form for all major political divisions of the world. Important
cartographic detail such as water bodies and major roads should also be readily
available [Robe 1974].

Registration is the key to the storehouse of knowledge in the base maps.
Major features apparent in the imagery must be brought into correspondence with
mapped features. The base map could then be used as a guide for a more detailed
analysis of the imagery. Lineal networks, such as roads and drainage, can be
searched  for  connecting  features  which  are  evident  in  the  imagery  but  were  either 
not  previously  present  or  were  ignored  during  base  map  compilation.  In  this 
manner  initial  manual  mapping  effort  can  be  used  to  guide  automatic  compilation 
of  detail  and  change.  Similar  to  a bootstrapping  operation,  successive  iterations 
could  yield  better  and  better  results  in  an  increasingly  automatic  mode. 

Eventually base maps (or map data base) could be at the same level of detail
and same resolution as the desired output product. Quite specific changes could
then be sought for display and for base map update. For example, water body
boundaries could be monitored for change, urban lots could be checked for change in
usage, or the progress of clear-cutting a forest could be recorded. The underlying
assumption is that enough of the scene remains constant for reliable
registration and calibration. Changes could then be detected automatically by
observing differences in some of the regions mapped in the data base from
those observed in the imagery.

3.4 ACES knowledge base


It is clear that use of real world knowledge will be necessary for an
ACES  type  system.  Sections  3.2  and  3.3  have  already  introduced  mechanisms 
for  applying  a priori,  or  stored,  knowledge  to  automatic  analysis  of  aerial 
imagery.  A detailed  categorization  of  knowledge  forms  available  to  ACES  is 
attempted  in  Section  5.  At  this  point  it  is  useful  to  specify  the  two  general 
knowledge  forms  which  must  be  input  to  ACES. 

First  of  all  ACES  should  have  a declarative  knowledge  base:  this  is 
essentially  a geographic  data  base  encoded  as  data.  Present  cartographic  data 
bases exist in declarative form such as GISTS [Cook 1974] or DLME [Silver 1977].
In both cases the knowledge is very static and is essentially a digital icon
of  a hard  copy  display.  As  discussed  in  Section  3.3,  however,  the  static 
geographic  data  base  can  be  a powerful  source  of  locational  knowledge  when  coupled 
with  registration  and  procedural  knowledge  techniques.  Procedural  knowledge  is 
implemented  through  programs.  To  some  a distasteful  knowledge  encoding  technique, 
embodying  knowledge  in  procedure  is  sometimes  the  only  way  to  approach  specific 
or  complex  contexts  or  actions. 

Signal  prototypes  or  discriminant  function  coefficients  could  be  stored  as 
data  for  later  use  in  classification  of  land  elements.  This  is  an  example  of 
declarative  knowledge  encoding.  On  the  other  hand,  for  processing  drainage 
networks  the  knowledge  that  "streams  flow  downhill  and  intersect  other  bodies 
of water" is perhaps best embodied in the code of a tracking algorithm. Proponents
of syntactic pattern recognition might argue that the tracking decisions can
be encoded as a grammar and hence as declarative knowledge, but there is no
evidence that more practical results will be obtained. Knowledge encoding and
use is a primary research topic in A.I. today with few definitive conclusions
reached. Further discussion is reserved for Section 5.

3.5 Human knowledge resource


It is inconceivable, given the present state of the art, that a system
such as ACES could be designed without primary consideration of human input.
In many tasks involving visual recognition and the use of inference the human
is still more reliable and faster than a computer. There are some interpretive
tasks which we do not now know how to encode in terms of computer operations.
Generalization and artistic rendition of graphical data are two such tasks.

Although  it  is  clear  that  human  input  is  necessary  to  ACES  it  is  not  clear 
how  that  input  will  control,  be  controlled  by,  or  cooperate  with  a computer 
system. Some successful work has been done on this [Barrow 1977, Ryan 1974] in
specific  circumstances  but  general  interactive  cooperation  between  man  and  machine 
has  not  yet  been  realized. 

In typical "automated" cartographic installations the human performs all
feature recognition operations at the point of data entry, i.e. during
digitization. It is exactly this laborious task which we want to automate in ACES.
Another typical human function performed in today's systems is that of editing
cartographic data bases for error -- error introduced in the manual digitization
input process! One currently feasible mode of operation for ACES would be to have
ACES perform automatic extraction and digitization and then present its results to
the human as an overlay on the source imagery. The human can then edit the machine's
work instead of his own. If the amount of editing were small this scheme would
represent another positive step toward automatic compilation. In any case the
human can apply whatever knowledge he has and need not even be conscious of it.

3.6 Input devices for ACES


The important conclusion here is that input device technology is far
in advance of our knowledge for using the data which it delivers. An exciting
array of devices is now available, offering a range of design choices.

. There  are  off-line  devices  for  delivering  billion  bit 
representations  of  an  entire  source  document  and  there 
are  CRT  graphics  systems  which  permit  human  editing  of 
local  features. 

. There  are  devices  which  uniformly  sample  the  entire  source 
image  and  there  are  those  which  can  selectively  sample 
specific  areas  selected  by  intelligent  processes. 

. There  are  devices  which  can  organize  samples  as  2-D  or  1-D 
arrays  or  as  random  sequences  of  points. 

. There are devices that can scan optical images, electronic
images, or images recorded on paper or film.

. There  are  devices  that  can  vary  their  resolution  and  there 
are  devices  that  can  be  calibrated  and  controlled  to  sense 
different  spectral  bands. 

The three sections below describe some available input devices according
to whether they sample the input in 2-D, 1-D, or randomly. This breakdown
may or may not be meaningful depending on the overall philosophy for
analyzing the data produced, as is discussed in Section 3.6.4.

3.6.1 Matrix input devices


Source material can be sampled simultaneously in 2-D by an array of
photodiodes. Arrays of 50x50 elements with 1 mil spacing were available on a
single chip in 1974 [Snow 1974]. Dynamic ranges of 1000:1 are possible. The chief
advantage of true 2-D sampling is stable geometry with totally electronic
sampling. However, most source imagery will be larger than any practical
photodiode array so other scanning techniques will still be necessary to move
the array with respect to the image, or vice versa. Some of the speed gained
by parallel sampling may be lost because of sequential I/O to a digital
computer. The arrays have serious potential if pattern recognition hardware is
placed between them and a digital computer.

Another parallel 2-D sampling device is the ROSA equipment [Lukes 1978]
which delivers sampled spatial frequency for a given window of the source data.
Its utility lies in recognition decisions best made in the Fourier domain. As
with photodiode arrays, the window of data viewed by ROSA is small relative to
the entire source material and other scanning techniques must be used for
positioning.

3.6.2  Raster  devices 

Raster  devices  organize  input  samples  into  a sequence  of  rows  of  samples, 
each  row  itself  being  a sequence  of  samples  in  time.  Thus  there  will  be  a total 
linear  ordering  induced  on  the  set  of  image  samples.  Raster  scanning  (plotting) 
implies that each pixel is considered once and only once and hence scan (plot)
time  is  constant  regardless  of  image  content.  This  is  the  chief  characteristic 
of  raster  devices  and  places  them  in  stark  contrast  to  random  linear  devices. 

Raster scanners can cover a source area of 120 cm by 150 cm at 0.0025 cm resolution
in 30 minutes with black and white output or 3 hours with 5 bit color/texture
output. At 0.0025 cm resolution about 3 billion pixels result. The devices just described
are clearly for operation off-line to data analysis and not intended for interaction.
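
The pixel count quoted for the raster scanner can be checked with a few lines of arithmetic. The sketch below assumes one sample per 0.0025 cm in each direction and uses the 30 minute black and white scan time; the variable names are illustrative only.

```python
# Scan area and resolution as quoted in the text.
width_cm, height_cm = 120.0, 150.0
resolution_cm = 0.0025

cols = round(width_cm / resolution_cm)    # samples per scan row
rows = round(height_cm / resolution_cm)   # number of scan rows
pixels = rows * cols                      # total samples for one sheet

# Sustained sampling rate implied by a 30 minute black and white scan.
rate = pixels / (30 * 60)                 # pixels per second

print(cols, rows, pixels, rate)
```

Under these assumptions the sheet yields 48000 by 60000 samples, i.e. 2.88 billion pixels, which rounds to the 3 billion figure, at a sustained rate of 1.6 million pixels per second.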

3.6.3 Linear input devices


Linear  input  devices  are  built  to  trace  linear  features  in  two  dimensions. 
Their  output  is  a collection  of  curves  which  creates  a sparser  and  more  highly 
organized  model  of  the  source  data  than  is  created  by  matrix  or  raster  scanners. 
The  most  common  equipment  type  is  the  "digitizer"  which  has  a positioning  head 
that  is  guided  over  the  features  by  a human.  The  position  of  the  head  is  known 
by  the  machine  via  mechanical  tracking  or  by  sensing  a location  on  a grid  in  the 
table  under  the  head.  While  the  operator  moves  the  head  along  the  feature  the 
machine  records  the  path  of  positions  by  sampling  in  either  time  or  distance 
intervals.  By  using  such  devices,  all  feature  recognition  is  performed  by  the 
human  before  data  is  entered  into  the  computer. 

Automatic  feature  recognition  is  possible  and  the  human  may  be  omitted  from 
the  linear  input  systems.  Research  is  continuing  "along  this  line"  including  the 
research  summarized  in  this  report.  Apparently  i/o  Metrics  has  had  a system  for 
some  time  that  can  follow  lineal  features  on  map  sheets  [Wohlmut  1974].  Once 
on  a point  of  a line,  a circular  neighborhood  around  the  point  is  scanned  to 
determine  the  next  point  of  the  line.  The  equipment  then  repositions  to  the 
new  point  found  and  continues.  An  area  of  up  to  5 ft.  by  5 ft.  can  be  scanned  with 
5 to  10  mil  increments  typically  used.  The  claim  is  that  problems  of  noise  spots 
and intersections are solved. It must be emphasized that this equipment was
designed for scanning existing map sheets and not imagery so that the source data
is idealized and clean.
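
The neighborhood-scan idea described above can be illustrated with a minimal sketch. It is not the i/o Metrics implementation, whose details are not given here; it assumes a clean binary image, a known starting point and heading, and a scan restricted to the forward half of the circle so that the tracker does not turn back on itself. The function name `follow_line` and all parameters are hypothetical.

```python
import math

def follow_line(image, start, heading, radius=2, max_steps=1000):
    """Trace a line in a binary image by repeated circular scans.

    image   -- list of rows; 1 marks the line, 0 the background
    start   -- (x, y) point known to lie on the line
    heading -- initial direction of travel in radians
    """
    rows, cols = len(image), len(image[0])
    x, y = start
    path = [(x, y)]
    for _ in range(max_steps):
        best = None
        # Sample the circle from -80 to +80 degrees about the heading.
        for k in range(-8, 9):
            angle = heading + k * math.pi / 18
            nx = round(x + radius * math.cos(angle))
            ny = round(y + radius * math.sin(angle))
            if 0 <= nx < cols and 0 <= ny < rows and image[ny][nx]:
                # Prefer the line pixel that deviates least from the heading.
                if best is None or abs(k) < best[0]:
                    best = (abs(k), nx, ny, angle)
        if best is None:
            break                      # no continuation found: end of line
        _, x, y, heading = best        # reposition and continue
        path.append((x, y))
    return path

# A 5 x 10 image containing one horizontal line in row 2.
image = [[0] * 10 for _ in range(5)]
image[2] = [1] * 10
path = follow_line(image, start=(0, 2), heading=0.0)
print(path)
```

Because only a small neighborhood is examined at each step, the work done is proportional to the length of the line rather than to the area of the sheet. Handling noise spots and intersections would require additional logic that this sketch omits.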

Preliminary work [NASA, 1977] has shown that some linear features can be
traced in real time at the platform. Success was reported in the tracking
of major land-water boundaries using MSS data. It would be quite a breakthrough
if performance could be extended to many features. The real-time constraint can
be dropped and geographic data base knowledge can be added to the tracking
process. Thus there is some reason to be optimistic. The major payoff in this kind
of operation is that feature recognition is done at a very early stage, eliminating
the need for 2-D image storage and processing. It is easy to imagine software
interacting with an analogue source, eliminating large amounts of redundant
intermediate digital data storage.

3.6.4  Discussion  of  input  concepts 

For  a system  that  scans  off-line  and  then  processes  digital  information  in 
batch  mode  the  previous  distinctions  made  for  input  devices  are  irrelevant.  All 
of  the  source  data  must  be  scanned  because  the  analysis  necessary  to  confidently 
ignore  parts  of  the  data  is  decoupled  from  the  input  operation.  Similarly  there 
is  no  meaningful  control  over  resolution.  The  time  sequence  and  organization  by 
which  samples  were  taken  is  likewise  not  meaningful  - - data  storage  organization, 
quite  a different  thing,  is  of  prime  concern.  Such  off-line  systems  are  typical 
of  the  day.  They  impose  more  constraints  and  allow  fewer  decisions  and  as  a 
consequence  simplify  in  a way  the  difficult  task  of  image  analysis. 

The most obvious consequence of the decoupling philosophy described above
is that systems have choked themselves on oversampled data. Resolution has to
be determined by the finest feature to be identified. Thus regions which do
not really have to be scanned at all are scanned, and are scanned at fine
resolution. Only manual digitization avoids all this. The more subtle
consequence is the loss of dynamic control over the sensing device, i.e. over
thresholds. Single calibration of a sensor for an image of a varied scene can cause
serious problems. Loss of contrast and corresponding requirements for digital
enhancement are common.

The  dynamic  control  of  the  sensing  environment,  i.e.  position,  resolution, 
and  calibration,  by  an  intelligent  process  offers  the  promise  of  a smaller  volume 
of  higher  quality  data  and  less  execution  cost.  There  is  also  the  capability  of 
viewing  the  same  data  in  different  ways  by  re-examining  it  in  several  stages  of 
a feedback  loop  driven  by  the  state  of  knowledge  gathered  about  the  data.  Such  a 
purposive system would have to contain a rich knowledge base to support its
decision-making. Judging by the size of current geographic data bases and encodings
of application specific knowledge, the process that intelligently takes its input
is likely to be choked with knowledge instead of data.

4. Components of the digital ACES


Section 2 defined the outputs desired from an ACES system and Section 3
described inputs available to ACES in order to arrive at the outputs. This
section discusses components of ACES which must cooperate in this transformation
of inputs to outputs. There is much to do since raw imagery is the basic input
and symbolic graphics are the basic output.

4.1  The  base  map  archive 

It was emphasized in Section 3 that a rich knowledge base would be
necessary for ACES to automatically extract features from raw aerial imagery.
A geographic or cartographic data base could provide a large portion of
applicable knowledge, in particular specific locational knowledge of previously
mapped features. Geographic data bases exist today and are proliferating. Current
uses are not oriented toward guiding image analysis, so novel enhancements to
current data bases should be expected to tailor them to the image analysis task.
Ways of augmenting current data bases are discussed below.


For  compactness  and  high  symbolic  representation,  it  seems  best  to  archive 
(map)  only  the  lineal  and  point  features  of  the  earth.  Regions  are  defined  by  a 
lineal  encoding  of  their  boundaries.  The  basic  element  of  the  geographic  data 
base  is  the  lineal  segment  (edge)  which  is  characterized  by  beginning  point 
(xb,yb), ending point (xe,ye), and three feature codes FL, FO, and FR which specify
the region feature to the left, the lineal feature itself, and the region to
the right. Lineal segments intersect at points called nodes or vertices and
the set of segments together with the set of vertices yield a graph which
represents the topology of an area of the earth. This is the basic structure
of the DIME file of the Bureau of the Census [Silver 1977]. To preserve
geometric shape as well as topology each segment can be associated with a chain
of points from (xb,yb) to (xe,ye) [Chrisman 1974]. This stored chain of points
is not only of value in the plotting of a product but also is useful for
identifying that boundary in new imagery of the same area. If the boundary segment
is  tagged  as  unique  or  special  by  either  manual  or  automatic  means,  it  may  well 
be  used  for  the  initial  registration  of  image  to  archive.  While  FO  is  a pointer 
to  information  about  the  shape  and  tone  of  a lineal  feature,  the  pointers  FL  and 
FR  can  point  to  information  identifying  the  content  of  the  regions  on  either  side. 
More  than  just  a region  number  is  appropriate:  tonal  signature  could  be  specified 
by  means  of  a discriminant  function  to  be  used  for  recognition  of  pixels  in  that 
region.  With  such  stored  information,  the  tracking  of  lineals  in  raw  imagery 
could  be  guided  by  map  content. 
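
The segment record and the graph it induces can be sketched as a small data structure. This is an illustration only and not the actual DIME file layout: the class name `Segment`, the field names, and the use of plain strings in place of the FL, FO, and FR pointers are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Point = Tuple[float, float]

@dataclass
class Segment:
    begin: Point   # (xb, yb)
    end: Point     # (xe, ye)
    fl: str        # FL: region feature to the left
    fo: str        # FO: the lineal feature itself
    fr: str        # FR: region feature to the right
    # Optional intermediate points preserving geometric shape.
    chain: List[Point] = field(default_factory=list)

def build_graph(segments: List[Segment]) -> Dict[Point, List[int]]:
    """Map each node (vertex) to the indices of the segments meeting there."""
    nodes: Dict[Point, List[int]] = {}
    for i, seg in enumerate(segments):
        nodes.setdefault(seg.begin, []).append(i)
        nodes.setdefault(seg.end, []).append(i)
    return nodes

segments = [
    Segment((0, 0), (5, 0), "forest", "road", "field"),
    Segment((5, 0), (9, 3), "forest", "road", "town", chain=[(7, 1)]),
    Segment((5, 0), (5, -4), "field", "stream", "town"),
]
graph = build_graph(segments)
print(graph[(5, 0)])   # indices of the segments meeting at node (5, 0)
```

In a fuller archive the three codes would point to stored descriptions, such as a discriminant function for each bordering region, rather than holding a name directly.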

There  will  be  problems  in  using  a map  archive  to  guide  image  interpretation. 
First  of  all  the  archive  will  be  large  and  hence  on  external  storage.  Thus  it  will 
be  necessary  to  stage  processing  so  that  symbolic  archived  information  and  raw 
sensed  data  arrive  at  the  computer  together.  Secondly,  it  is  desirable  that  the 
same map be applicable to the analysis of images of different scales and the
generalization problem will be encountered. Generalization principles often require
logical interpretation contrary to the physical facts depending on level of detail
required. Such processes will be very difficult to do automatically. Third,
the raw input will be in raster or array format while the archived data will
be lineal (random) and symbolic. Due to the large amount of symbolic and raw
data  to  be  compared,  both  image  and  archive  will  need  to  be  highly  structured 
and  compared  in  sorted  order.  This  implies  a great  deal  of  processing  overhead 
in  searching  and  in  I/O. 


4.2  Class  slice  or  thematic  files 

An ACES system would probably have to preprocess the raw aerial imagery
to some extent in order to reduce the data volume and increase symbolic
content. The most primitive operation in this direction would be to place
symbolic labels on each pixel (x,y) of the image based on a vector m(x,y) of
measurements taken over a small neighborhood of (x,y). The vector m is easily
conceived as a set of bands in the MSS case and could be combined tonal and
textural measures in the B&W case.

To each pixel (x,y,m(x,y)) of the preprocessed image can be assigned class
labels from a pre-determined set $C = \{c_1, c_2, \ldots, c_k\}$. The label set is not
mutually exclusive and the decision to label point (x,y) with label $c_i$ could
be independent of the decision to label (x,y) with $c_j$. In this manner k binary
images $C_i = (x, y, b_i(x,y))$ could be generated from the preprocessed image
independently and in parallel according to the decision rule

$$b_i(x,y) = 1 \iff P(c_i \mid m(x,y)) \geq t$$

where $P(c_i \mid m(x,y))$ is the probability that $c_i$ is the class label given
measurement vector m and t is some threshold. Binary images thus created are called "class
slices" or "overlays". Note that the same position in several class slices can
be 1, reflecting ambiguity or mixed properties of (x,y,m(x,y)). Class slices could
be used for extraction and separate processing of uniform map features such as road
networks, vegetation, desert, etc. Each class slice can be further processed
according to specific semantics without regard for the processing of other slices if
appropriate. A very high level control can finally integrate all k class
slices to form a consistently interpretable image.
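
As a minimal sketch of the class-slice decision rule (a pixel is set to 1 in the slice for a class when the probability of that class given the measurement vector reaches the threshold t), the fragment below assumes the per-class probabilities have already been computed by some classifier. The function name `class_slices` and the probability values are hypothetical.

```python
def class_slices(prob, t):
    """Threshold per-class probability images into binary class slices.

    prob -- dict mapping class label c to an image whose entry [y][x]
            holds P(c | m(x,y))
    t    -- decision threshold applied independently to every class
    """
    return {
        label: [[1 if p >= t else 0 for p in row] for row in image]
        for label, image in prob.items()
    }

# Hypothetical probabilities for a 2 x 2 image and two classes.
prob = {
    "road":  [[0.9, 0.6],
              [0.1, 0.2]],
    "water": [[0.2, 0.7],
              [0.8, 0.1]],
}
slices = class_slices(prob, t=0.5)
print(slices["road"])
print(slices["water"])
```

Each slice is computed without reference to the others, so the same pixel may be set in several slices (here the pixel at x=1, y=0) or in none (the pixel at x=1, y=1). Resolving such cases is left to later processing.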


Overlays  formed  in  this  manner  can  be  cleaned  up  by  operations  discussed 
in Section 6 of this report or can be passed against the lineal feature archive
for  massaging  or  change  detection.  When  cleaned  the  overlays  can  be  plotted 
creating  desired  map  products  and  can  be  processed  for  update  of  the  lineal 
feature archive. Because of their large bulk and low level of symbolization,
overlays  probably  will  not  be  saved  permanently  after  image  analysis  is  complete. 

4.3  Elevation  matrices 

The  elevation  of  a point  (x,y)  may  be  received  as  one  measurement  m(x,y) 
which  cannot  sensibly  be  reduced  to  a binary  value.  m(x,y)  is  derived  by  using 
two  (stereo)  images  not  one.  Elevation  data  must  be  viewed  as  a finished  output 
product  of  ACES  as  well  as  a valuable  input  to  aid  in  other  image  interpretation 
tasks.  Elevation  data  for  use  by  automatic  procedures  is  probably  best  left  in 
matrix  rather  than  lineal  contour  form. 

4.4  Procedural  knowledge  routines 

Part  of  ACES  a priori  knowledge  will  be  embedded  in  procedures  or  programs. 
Procedural  knowledge  is  necessary  in  cases  where  the  declarative  knowledge,  such 
as  that  stored  as  data  in  the  map  archive,  is  insufficient  for  decision  making. 
It is difficult to encode as data the rules of thumb that "roads tend to be
continuous and tend to intersect other roads at blunt angles", yet it is not
difficult to program such rules. Such a "road-knowledgeable" program is just
what is required to clean up the road class-slice as discussed in 4.2. Such
procedural knowledge is only applicable in specific contexts; while the
aforementioned program might also do well on drainage features it would not apply
to other overlays. Similar procedural knowledge is applicable to region type
features.  For  example,  "water  tends  to  be  continuous  in  2-D  extent  at  points 
with  the  same  elevation". 

4.5  Utility  routines 

A great  deal  of  ACES  system  overhead  can  be  expected  in  dealing  with  image 
data  and  in  accessing  the  knowledge  base  and  temporary  overlay  files.  A large 
number  of  utility  routines  will  be  required  in  order  to  support  ACES  analysis. 

Once  feasible  decision-making  algorithms  are  devised  the  special  support  hardware 
and  software  required  will  become  evident  and  should  be  state-of-the-art. 

4.6  Process  control 

The  most  complex  part  of  ACES  will  be  the  control  mechanism  used  to  integrate 
the raw data and a priori knowledge in the interpretation decisions. Making
decisions with contradictory or ambiguous information is included in the task. Past
A.I. work has developed several methods of control, all of which seem too weak and
inflexible  for  attacking  the  image  interpretation  task  at  hand.  These  control 
methods  include  the  following. 

. finite automata
. semantic networks
. production language
. Bayesian inference
. theorem proving/predicate calculus
. hierarchical pattern classification

The first ACES systems will thus be controlled in a highly stereotyped or
constrained manner, applying to the task only that fraction of applicable
knowledge that is practical under current representational and decision making
paradigms.

5. Knowledge available to ACES


It is now obvious to researchers that automation of aerial image analysis
will not be achieved unless large amounts of a priori real world knowledge are
available to analysis procedures. This section surveys and categorizes
knowledge sources which support the interpretation of spatial geographic imagery.

It  is  easy  to  believe  that  humans  use  all  of  these  knowledge  sources,  either 
consciously  or  unconsciously,  in  their  image  analysis  tasks.  Unfortunately, 
knowledge available to a human who may readily use it may be unusable to a
machine due to the practical problems of encoding it, accessing it and
synthesizing complex decisions from that and other knowledge sources. "Knowledge
engineering"  [Feigenbaum  1977]  is  at  an  interesting  stage  of  development  but 
has  not  yet  demonstrated  the  capability  of  handling  general  image  analysis 
tasks.  Not  only  is  there  a problem  of  combining  knowledge  in  complex  and 
possibly  ambiguous  situations  but  there  is  also  the  problem  of  efficiency  in 
case  a logical  solution  exists.  Computers  have  neither  the  complex  decision 
control  strategies  of  the  human  nor  the  vast  memory  resource.  It  is  therefore 
necessary  at  this  point  in  time  for  scene  analysis  researchers  to  carefully  select 
and  define  knowledge  sources  useful  for  implementation  and  to  carefully  test  their 
performance  capabilities. 

5.1 Spectral or tonal knowledge


The  reflectance  from  a particular  material  should  remain  relatively 
constant  from  one  observation  time  to  another  when  viewed  with  the  same 
sensor.  There  will  be  some  alterations  due  to  changes  in  the  atmosphere 
or  sensor  electronics  or  due  to  changes  in  the  material  itself  such  as 
maturing of foliage. Catalogues of signals can be compiled showing typical
reflectances received by different sensors. With Landsat data each sensor
samples a specific band of the electromagnetic spectrum. Use of several
reflectances can usually narrow the classification of an unknown ground
element to a few possible materials. Much cataloguing has already been
done at the ERIM laboratory [ERIM 1975]. Vincent [1973] has presented
a technique using the ratios of different spectra to achieve classification
which is less sensitive to atmospheric and sensor changes than using absolute
signal levels. Thus with MSS data, approximate pixel classification is easy
to  get  in  terms  of  theory  and  computation.  Higher  level  knowledge  is  necessary 
in  order  to  refine  results  of  spectral  classification.  With  black  and  white 
imagery  (B&W)  only  shades  of  grey  are  available  at  the  pixel  level  and  thus 
little  classification  information  is  available  from  tone  alone.  Connected  areas 
of  pixels  of  the  same  tone  can  sometimes  be  amassed  providing  regions  which 
can  be  classified  according  to  their  size  and  shape.  This  implies  use  of  other 
types  of  knowledge. 
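
The appeal of spectral ratios can be seen in a small worked example. It is a simplification and not Vincent's actual procedure: it assumes that a change in atmosphere or sensor gain multiplies every band by the same factor, in which case the ratio of two bands is unchanged while the absolute levels are not. The band values and the gain factor are invented for illustration.

```python
def band_ratio(pixel, i, j):
    """Ratio of the signal in band i to the signal in band j."""
    return pixel[i] / pixel[j]

# Hypothetical four-band signal for one ground element.
signal = (40.0, 80.0, 20.0, 10.0)

# The same element observed with every band scaled by a common gain.
gain = 1.5
scaled = tuple(gain * v for v in signal)

print(signal[0], scaled[0])                              # absolute levels differ
print(band_ratio(signal, 0, 1), band_ratio(scaled, 0, 1))  # ratios agree
```

A classifier keyed to absolute levels would see two different signals here, while one keyed to band ratios would see the same signature. Additive effects such as haze are not removed by this simple ratio.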

5.2 Spatial, structural, or geometric knowledge


Most  geographic  entities  are  partly  characterized  by  their  2-dimensional 
extent  in  the  image.  Features  are  usually  continuous  and  large  in  size 
relative  to  the  resolution  elements.  These  assumptions  allow  the  collection 
of  similar  pixels  into  contiguous  regions  or  curves.  Once  regions  or  curves 
exist,  geometric  features  can  be  used  for  classification. 

5.2.1  Neighborhood  dependence 

Since  observed  regions  are  assumed  to  be  large  relative  to  pixel  size, 
the probabilistic interpretation of adjacent pixels must be done jointly rather
than independently. One way to do this is to allow the neighbors of a pixel
to condition the probability of classifying a pixel in any primitive class.
Another way would be to take the class of a pixel to be the majority
of the preliminary classifications of its neighbors, made without use of context.
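
The second approach, majority voting over preliminary labels, can be sketched as follows. The sketch assumes a 3 x 3 neighborhood that includes the pixel itself and is clipped at the image border; these choices, and the function name `majority_relabel`, are assumptions rather than anything specified in the text.

```python
from collections import Counter

def majority_relabel(labels):
    """Replace each label by the most common label in its 3 x 3 neighborhood."""
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for y in range(rows):
        for x in range(cols):
            # Count preliminary labels over the neighborhood, clipped at edges.
            votes = Counter(
                labels[ny][nx]
                for ny in range(max(0, y - 1), min(rows, y + 2))
                for nx in range(max(0, x - 1), min(cols, x + 2))
            )
            out[y][x] = votes.most_common(1)[0][0]
    return out

# A water region ("W") with one isolated pixel labelled road ("R").
preliminary = [
    ["W", "W", "W"],
    ["W", "R", "W"],
    ["W", "W", "W"],
]
smoothed = majority_relabel(preliminary)
print(smoothed)
```

The isolated label is outvoted by its neighbors and the region becomes uniform. The same operation would also erase genuinely thin features such as narrow roads, which is one reason curve features need separate treatment.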

5.2.2  Connectedness 

Region  or  curve  features  are  imaged  onto  connected  sets  of  pixels. 
Connectivity  may  be  violated  due  to  noise  especially  in  the  case  of  curves. 
Tracking  connected  sets  of  pixels  which  have  similar  image  features  is  an 
easy  job  for  the  computer  because  connectivity  can  be  checked  by  only  local 
operations.  Tracking  of  connected  objects  can  also  be  adjusted  to  allow  for 
some  noise  distortions. 
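
A minimal sketch of tracking connected sets with purely local operations is given below. It assumes a binary image and 8-connectivity (a pixel is connected to its eight immediate neighbors); the function name `connected_sets` is hypothetical.

```python
def connected_sets(image):
    """Group the 1-pixels of a binary image into 8-connected sets."""
    rows, cols = len(image), len(image[0])
    seen = set()
    components = []
    for y in range(rows):
        for x in range(cols):
            if image[y][x] and (x, y) not in seen:
                # Grow a new set outward from this unvisited pixel.
                stack, component = [(x, y)], []
                seen.add((x, y))
                while stack:
                    cx, cy = stack.pop()
                    component.append((cx, cy))
                    # Only the immediate neighbors are ever examined.
                    for ny in range(max(0, cy - 1), min(rows, cy + 2)):
                        for nx in range(max(0, cx - 1), min(cols, cx + 2)):
                            if image[ny][nx] and (nx, ny) not in seen:
                                seen.add((nx, ny))
                                stack.append((nx, ny))
                components.append(component)
    return components

image = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
components = connected_sets(image)
print(len(components), sorted(len(c) for c in components))
```

Tolerance for noise could be added by widening the neighborhood examined at each step so that small gaps are bridged, at the risk of merging objects that are truly separate.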

5.2.3 Shape and size


Once  regions  and  curves  of  connected  sets  of  pixels  having  homogeneous 
properties  are  assembled,  these  objects  can  be  interpreted  by  size  and  shape 
features. (Many size and shape features can be computed while the connected
objects are being tracked. See [Agrawala 1977].) Thin smoothly curving
objects are likely to be streams or roads according to a real world model.
Long, thin, straight objects are almost surely roads. Large rectangular
objects are likely to be housing blocks or agricultural fields.

5.3  Spatial  distribution 

Often  the  interpretation  of  an  object  will  depend  upon  how  it  relates 
to  other  objects  in  its  proximity.  This  neighborhood  dependence  is  more 
general and higher level than that discussed in 5.2.1. If several polygonal
objects  of  a certain  size  are  detected  in  close  proximity  to  each  other,  the 
entire  set  may  be  interpreted  as  buildings.  Nearby  road  features  would  further 
support  such  an  interpretation. 

5.4 Associative/semantic knowledge

Perhaps  at  the  highest  level  is  associative  or  semantic  type  knowledge  which 
allows  image  features  and  objects  to  be  interpreted  or  understood  with  respect  to 
a global model which is causal or relational in nature. This requires
detailed understanding of the functions or purposes of objects and
relationships among them which are not apparent in the imagery. For example, we have
the general knowledge chunk "evergreens grow on steep rocky slopes". This
knowledge relates geological structure and elevation information to vegetation
type and might be particularly useful in the interpretation of B&W imagery
where all three related items (evergreens, rocks, and steep terrain) could
not be simultaneously observed. A second example is the knowledge that "drainage
matches  terrain".  The  causal  model  is  that  water  runs  downhill  only  and  its 
passage  erodes  the  ground.  This  implies  that  an  observed  stream  path  must  always 
follow  a non-increasing  sequence  of  elevations  and  that  elevations  on  the  path 
are  likely  to  be  lower  than  those  at  right  angles  off  the  path.  This  type  of 
model  can  be  made  "known"  to  an  automatic  device  by  encoding  it  into  a tracking 
procedure  specific  to  drainage  features.  Since  roads  can  travel  up  and  down  hills, 
the  same  knowledge  is  inappropriate  for  tracking  roads.  A third  example  giving 
knowledge useful in detecting roads is that "roads often have long straight
segments and tend to intersect each other". The appropriate model is that roads are
man-made  and  connected  into  a network  to  allow  traffic  between  many  points.  In 
the  absence  of  obstacles  or  varying  terrain  the  most  efficient  path  is  along  a 
straight  line.  Such  knowledge  can  be  used  by  an  automatic  system  in  the  following 
manner.  At  a low-level  of  the  system  edge  features  can  be  aggregated  to  form 
curves.  Any  curves  of  certain  dimensions  which  have  straight  segments  would  strongly 
imply  a road  feature.  At  the  next  level  of  the  system  the  topological  relationship 
of connection would be observed. Any connection of degree 3 or 4 of two
curves, each having a straight segment, would further support the road
interpretation.  Gaps  in  curves  of  an  assembled  network  could  be  checked 
more  carefully  for  edge  content  and  for  geometrical  alignment  and  the  gap 
filled  if  appropriate  results  are  obtained. 
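
The drainage constraint discussed above, that a stream path must follow a non-increasing sequence of elevations, is an example of knowledge that is simple to embody in a procedure. The sketch below assumes an elevation matrix indexed as elevation[y][x] and a candidate path given as a list of (x, y) points; the function name and the tolerance parameter, which allows for measurement noise, are hypothetical.

```python
def is_plausible_stream(path, elevation, tolerance=0.0):
    """True if elevations along the path never rise by more than tolerance."""
    heights = [elevation[y][x] for x, y in path]
    return all(b <= a + tolerance for a, b in zip(heights, heights[1:]))

# Hypothetical elevation matrix sloping down toward the lower right.
elevation = [
    [9, 8, 7],
    [8, 6, 5],
    [7, 5, 3],
]
downhill = [(0, 0), (1, 1), (2, 2)]
uphill = list(reversed(downhill))

print(is_plausible_stream(downhill, elevation))  # consistent with drainage
print(is_plausible_stream(uphill, elevation))    # violates the constraint
```

A tracking procedure for drainage could apply this test to reject candidate continuations or to orient a traced curve. The same test would be inappropriate for roads, which are free to climb.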

5.5  A priori  positional  knowledge 

The  most  specific,  and  perhaps  the  most  useful  knowledge  to  an  automatic 
system,  is  positional  knowledge  of  features  as  stored  in  a geographic  data 
base (GDB). The points along linear features recorded in the data base are
easily  used  for  interpreting  imagery  registered  to  the  data  base.  For  instance, 
the  forks  of  the  Shenandoah  River  at  Front  Royal  would  certainly  be  encoded  in 
a GDB  of  that  area  and  would  be  readily  used  in  interpreting  the  drainage  versus 
road  network  in  imagery  of  the  area.  Encoded  road  features,  on  the  other  hand, 
would  be  useful  in  tracking  the  full  road  network  after  initial  registration  with 
points  of  the  image.  The  continuity  of  the  iconic  features  in  the  GDB  would  be 
invaluable  in  efforts  to  overcome  the  fragmenting  effects  of  shadows,  poor  contrast, 
or  vegetation  canopy.  Presence  of  roads  would  lend  power  to  the  interpretation  of 
building-like  objects  detected  in  the  imagery  but  not  present  in  the  GDB.  Presence 
of  streams  would  aid  in  the  search  for  new  bridges  and  hence  new  roads. 

A priori  knowledge  need  not  consist  of  only  iconic  or  geometric  feature  type 
knowledge but can also be very general. For instance, given imagery over the state
of Maryland, it is known that there will be no glaciers and no large lakes.
This  knowledge  may  enable  the  isolation  of  large  areas  of  cloud  cover  in  the 
imagery.  Together  with  the  known  time  of  the  imagery  it  could  also  be  known 
exactly  which  land  classes  would  be  present  for  interpretation.  For  example, 
for March imagery over Garrett County, Maryland, snow cover would be highly
likely while vegetation would be much suppressed. With MSS data, classifiers
(class-slices) for snow and conifer could be used but classifiers for corn or
tobacco could not be used.

5.6  Use  and  encoding  of  knowledge 

The  previous  sections  discussed  various  categories  of  knowledge  available 
in  image  interpretation  and  gave  some  specific  examples.  It  is  appropriate  to 
examine  in  more  detail  exactly  how  such  knowledge  can  be  encoded  and  used  for 
automatic interpretation. Knowledge can be encoded in a declarative form, meaning
that it is encoded as data to be used by a uniform decision procedure. A
uniform  procedure  is  one  that  behaves  the  same  way  for  all  feature  classes. 
Spectral  knowledge  is  easily  encoded  in  this  form  — discriminant  functions 
(specified  by  coefficients)  or  signature  prototypes  can  be  used  to  define  classes 
of  data  and  the  same  decision  procedure  can  be  applied  for  all  classes  of  data. 
Knowledge  can  also  be  encoded  in  procedural  form,  meaning  a program  is  written 
to  implement  the  knowledge.  Very  high  level  feature  specific  knowledge  is  perhaps 
best implemented in procedural form. Declarative and procedural implementations
each have their respective advantages and disadvantages; for an interesting
discussion see [Winston 1977]. Table 5.1 relates the several knowledge sources
previously  discussed  to  methods  of  encoding  and  using  that  knowledge  in  a computer. 


Table 5.1  Encoding and using knowledge

Category of knowledge   Encoding of knowledge           Use in interpretation
----------------------  ------------------------------  --------------------------------

Spectral/tonal          (declarative)                   use to classify single pixels
                        prototypes, discriminant        by estimating
                        functions, density functions    p(given class | spectral info.)

Spatial/neighborhood    (declarative)                   use to classify single pixels by
                        extend above to work on         p(given class | spectral info.
                        joint information               of neighborhood)

                        (procedural)                    use to assemble larger objects
                        use connectiveness property     such as regions and curves
                        in tracking procedures

Spatial/geometric       (declarative)                   uniform procedure assigns
                        set of possible labeled         labels to image objects by
                        features each with list of      checking properties
                        qualifying properties

                        (procedural)                    key properties cause control
                        set of possible labeled         mechanism to evoke specific
                        features each with qualifying   procedure to recognize specific
                        procedural logic                feature from properties

Spatial distribution    (declarative)                   interpret individual objects by
                        syntactic approach possible     considering a set of objects and
                        but not recommended             their properties in an attempt
                                                        to assign consistent
                        (procedural)                    interpretations to all according
                        set of procedures for           to spatial relationships
                        interpreting cartographic
                        features, one procedure for
                        each feature

Associative/semantic    (declarative)                   use to clean up networks of
                        syntactic or deductive          curves and regions. Use to
                        approaches possible but not     perform top-down search for
                        recommended                     faint features connecting to a
                                                        network or faint features
                        (procedural)                    enclosed in regions. Perform
                        set of procedures for           more specific interpretations of
                        interpreting cartographic       objects according to arbitrary
                        features, one procedure for     combinations of current
                        each feature or group of        information.
                        associated features

5.7  Discussion of knowledge sources useful for ACES


Several  of  the  knowledge  sources  described  above  can  readily  be  applied 
to automatic image interpretation while several seem unusable given the
current state of the art. Spectral or tonal information is easily and
efficiently applied. The assumption is that the material content of an element
of  the  earth's  surface  is  easily  classified  from  the  spectral  characteristics 
of  its  reflectance.  Spectral,  or  MSS,  classification  is  in  fact  computationally 
easy  as  is  evidenced  by  the  plethora  of  implementations  and  experiments  reported. 
Classification  performance  has  not  always  been  satisfactory,  however,  largely 
because of the "signature extension" problem, that of classifying data gathered
under circumstances different from those existing for training. The beauty of
MSS classification is that it interprets geographic image data at the lowest level
without any spatial context or complexity of decision. When viewed as a final
process, MSS classification is error-prone and insufficient for structural
interpretation of an image; when viewed as a low-level data interpretation and reduction
step,  MSS  classification  could  be  regarded  as  a necessary  and  valuable  step  in 
image  analysis. 

Properties  such  as  connectedness  or  adjacency  are  also  easily  handled  by  a 
computer  because  of  local  operation.  Decisions  made  at  a point  in  the  space  depend 
only  on  very  limited  context  around  that  point.  If  natural  geographic  regions  in 
fact  image  to  many  resolution  elements  it  is  clear  that  limited  context  decisions 
can significantly improve the performance of interpretation [Welch 1971].

Spatial texture features are more difficult to apply because there is no
uniform neighborhood over which to compute them. Gramenopoulos [1973] and
Haralick [1973] have reported good results with texture features computed
over  fixed  windows  for  land  use  classification.  However,  caution  must  be 
used  in  evaluation  of  these  results  because  the  mathematical  models  used  to 
capture  texture  seem  very  weak  when  compared  to  human  perception  of  texture 
in  analysis  of  black  and  white  photography.  Black  and  white  photography 
appears  to  be  particularly  unsuited  for  automatic  feature  extraction  because 
of  this  dependence  on  texture  perception  over  variable-sized  regions  and 
almost  complete  unavailability  of  tonal  cues. 

Associative/semantic type knowledge is also difficult to apply in automatic
processing because of many factors: 1) it tends to be highly specific,
2) it is difficult to determine exactly which contexts elicit its use, and 3)
the output product of using the knowledge is hard to define in terms of inputs.
Consider  these  points  in  application  of  the  knowledge  that  "drainage  matches 
elevations".  Unfortunately  definitive  use  of  this  "matching"  would  probably 
result  in  a procedural  implementation  of  the  knowledge  as  discussed  in  Section  5.6. 

Primitive shape and size features are very important for automatic processing
because they are easy to extract and should be more reliable interpretive
cues than single tonal samples. The aggregation of similar tones or gradient
into straight or highly curving arcs, for example, can significantly reduce
the effects of noise, yield higher level interpretation, and enable registration
with structures in an existing map. Knowledge of shape alone is insufficient
to distinguish between a road and a river and it is clear that other knowledge
must be invoked to make a distinction. At the lower end, spectral information
and  knowledge  would  easily  do  the  job  and  at  the  higher  end  "matching"  with 
elevation  information  could  do  it,  but  with  somewhat  more  calculation. 

Although  very  specific,  a priori  positional  knowledge  is  easy  to  use  and 
will  be  invaluable  in  automatic  image  analysis.  A priori  positional  knowledge 
is  already  used  in  semiautomatic  systems  to  establish  the  registration  of  image 
points  to  points  in  a standard  coordinate  system.  With  registration,  base  maps 
become  highly  organized  knowledge  sources  useful  for  the  interpretation  of  new 
imagery.  The  base  map  could  be  regarded  as  an  organized  summary  of  the  previous 
analysis  of  the  same  area.  Registration  allows  previous  interpretation,  whether 
human  or  automatic,  to  guide  future  interpretation.  There  are  many  specific 
techniques to consider, such as verifying the presence of bridges over drainage
or  scanning  known  parking  lots  to  count  vehicles.  It  is  vital  to  recognize  the 
general  importance  of  the  role  of  registration  in  applying  existing  knowledge 
stored in a geographic information system. Some general capabilities are as
follows.

1) For assessment of change, significant features of an image must be
compared with significant features of a map. Identical features
should of course correspond positionally via registration. Significant
features are those detected after some structural combination
of tonal elements.

2) For calibration of the sensor it might be necessary to compare regions
of the sensed data with known regions in the data base. As an example,
consider the case where a lake is known to exist in the area. After
registration, tonal samples could be taken from the image in a region
known to be lake. The calibrated water signal could then be used to
detect other water bodies in the image.

3) For multidate feature extraction, registration is absolutely essential.
By correlation of positions in images taken during different seasons,
more information should result than from single-date imagery.
In this way, for example, deciduous and coniferous forest might easily
be distinguished from each other and from other land classes.
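
The calibration capability in item 2 can be sketched in Python as follows. This is an illustration only: it assumes a single-band image held as a list of rows, a binary mask of the known lake already registered to the image, and a tolerance rule (`k` standard deviations, with a floor `min_spread`) that is an arbitrary choice of this sketch rather than a recommended setting.

```python
from statistics import mean, pstdev


def calibrate_from_known_region(image, known_mask):
    """Collect tonal samples where the registered data base says 'lake'."""
    samples = [image[r][c]
               for r in range(len(image))
               for c in range(len(image[0]))
               if known_mask[r][c]]
    return mean(samples), pstdev(samples)


def detect_similar(image, center, spread, k=2.0, min_spread=1.0):
    """Mark every pixel whose tone lies within k spreads of the calibrated tone."""
    tol = k * max(spread, min_spread)
    return [[1 if abs(v - center) <= tol else 0 for v in row] for row in image]


# Invented tonal values: dark pixels stand for water, bright pixels for land.
image = [
    [52, 50, 11, 10],
    [51, 12, 10, 49],
    [9, 50, 53, 48],
]
# Region known from the data base to be lake (1 = lake).
known_lake = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

center, spread = calibrate_from_known_region(image, known_lake)
water = detect_similar(image, center, spread)
for row in water:
    print(row)
```

The resulting mask marks the known lake pixels plus the two other dark pixels, which is the sense in which the calibrated signal finds other water bodies.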

6.  Automatic recognition of features


Aerial  imagery  has  been  used  as  input  to  map  compilation 
since  before  the  advent  of  the  airplane.  It  has  been  estimated 
that  perhaps  80%  of  all  map  features  can  be  extracted  from 
photographs  (Doyle,  1973).  Cartographic  features  can  thus  be 
put  in  3 categories — 1)  features  that  are  not  apparent  in 
aerial  imagery,  2)  features  that  are  apparent  in  aerial  imagery 
but  cannot  be  easily  extracted  automatically,  and  3)  features 
which  can  be  easily  extracted  from  aerial  imagery.  Existence 
of  category  1 precludes  complete  automation  of  complete  maps. 
However,  accommodation  for  category  1 by  any  semi-automatic 
system  easily  provides  a way  out  for  category  2 problems.  The 
relative  size  of  category  3 is  not  known.  Experiments  in 
automatic  image  interpretation  are  continuing  to  make  progress. 
One  of  the  greatest  stumbling  blocks  to  this  progress  is  the 
lack  of  available  data  at  the  resolution  necessary  to  classify 
features.  For  example,  researchers  (Li,  1976;  Bajcsy,  1976; 
VanderBrug,  1976)  have  been  struggling  with  the  mapping  of  roads 
using  the  insufficient  ground  resolution  of  ERTS  imagery. 

Cartographic  features  which  are  certainly  not  available 
from  aerial  imagery  include  political  boundaries,  place  names, 
building functions (e.g. church, general's headquarters, etc.),
past features (e.g. a forest removed 2 years ago), and subsurface
features (e.g. rock formations, mines, pipelines). If desired
on  a map  these  features  must  be  gathered  by  other  collection 
techniques  and  positionally  merged  with  features  extracted  from 
imagery.  Cartographic  features  which  are  apparent  in  imagery 
and  might  be  extracted  automatically  include  elevation  data, 
soil  regions,  vegetation  regions,  urban  regions,  water  bodies, 
water  networks,  and  road  networks.  Further  breakdown  of  these 
classes  is  possible.  For  example,  vegetation  regions  include 
forest,  crop,  scrub,  and  park.  Further  discussion  of  map 
feature  classification  is  reserved  for  the  next  section. 

If  there  is  a 1 to  1 correspondence  between  ground  resolution 
elements  and  map  resolution  elements,  a thematic  map  can  be 
produced  by  simply  printing  the  map  element  in  a color  (theme) 
coding  the  land  class  of  the  ground  resolution  element.  For 
example,  resolution  elements  inside  a lake  can  map  into  blue 
squares on the map, elements inside a forest can map into green
squares,  etc.  The  color  code  need  not  be  natural — roads  could 
be  red  for  instance!  The  color  code  for  the  land  element  is 
chosen  by  classification  logic  acting  on  the  spectral  signature 
of  the  ground  resolution  element.  While  no  human  would  be  so 
foolish  as  to  do  this,  an  automatic  device  is  likely  to  require 
and succeed at such a simple 1 to 1 coding of an image. The
machine has decided advantages over the human in spectral and
multispectral classification in the absence of spatial information.
A map produced by a 1 to 1 coding of resolution
elements will be called a thematic map. Thematic maps have
been  produced  for  several  years  from  ERTS  imagery.  They  can 
be  regarded  not  only  as  finished  products  but  also  as  input  to 
further  map  compilation.  Although  often  quite  rough  when  created 
as  described  above,  thematic  maps  are  quite  useful  for  human 
consumption because the human interpreter can smooth and adjust
the  results  during  his  interpretation.  Additional  processing 
is  necessary  to  convert  thematic  maps  to  symbolic  maps  meeting 
cartographic  standards.  Further  processing  of  thematic  maps 
is  covered  in  section  6.4. 
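
The 1 to 1 coding that defines a thematic map is simple enough to state as a few lines of Python. The sketch assumes the classification logic has already assigned one land class label to each ground resolution element; the labels, the color code, and the function name `thematic_map` are hypothetical.

```python
# Hypothetical color code; as the text notes, it need not be natural.
COLOR_CODE = {"water": "blue", "forest": "green", "road": "red"}


def thematic_map(class_labels, color_code=COLOR_CODE, unknown="white"):
    """Map each classified ground resolution element to one colored map element."""
    return [[color_code.get(label, unknown) for label in row]
            for row in class_labels]


labels = [
    ["forest", "forest", "road"],
    ["water", "forest", "road"],
]
for row in thematic_map(labels):
    print(row)
```

No spatial context is consulted at any point, which is why the result is rough and needs the further processing described later.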

When  we  speak  of  the  automatic  recognition  of  "cartographic  features"  we 
are referring to features symbolized on a map such as roads and streams. In
order to automatically extract cartographic features from aerial imagery,
algorithms must ultimately measure "image features" which may be quite
different from cartographic features. Traditionally, image features have been
measured  for  each  pixel  (x,y)  during  preprocessing.  What  image  features  are 
available  certainly  depends  upon  whether  there  is  single  sensor  or  MSS  data. 
With  MSS  data  there  are  several  registered  tonal  samples  for  each  pixel;  often 
enough  to  yield  image  features  meaningful  for  compilation  of  cartographic 
features.  Besides  tonal  measurements  on  (x,y)  there  are  measurements  which 
relate to the texture or 2-D structure of a local neighborhood of (x,y).
Together the tonal, structural, and textural measurements at (x,y) could
be used to include (x,y) as a point in some cartographic feature. Use of
MSS imagery for automatic cartographic compilation is treated in Section 6.3.
The case of B&W imagery is taken up in 6.4.

6.1  Classes of data

Interesting  classes  to  be  coded  on  a map  were 
discussed  above  and  include  soil,  vegetation,  water,  etc.  There 
are  problems  with  land  classification  which  do  not  become  clear 
until  automatic  classification  is  considered.  There  is  no 
worldwide  standard  for  soil  classification  (McRae,  1976)  and 
vegetation classification; therefore, general a priori
catalogues  of  map  features  are  of  dubious  value  and  intent. 
Moreover,  map  makers  have  relied  on  tradition  and  subjective 
judgment  in  their  mapping.  Two  points  become  critical  when 
automatic  mapping  is  considered.  First  of  all,  objective 
criteria  must  be  used  for  land  classification.  Difficulty  in 
matching  objective  criteria  of  a machine  to  subjective  criteria 
of  humans  using  its  product  is  evident  from  the  confusion 
matrices  gotten  in  land  use  classification  experiments.  Secondly, 
if  we  are  really  interested  in  (fast,  economical,  effortless) 
automatic  classification  it  may  be  necessary  to  choose  the 
classes  in  order  to  optimize  the  machine's  performance.  This 
could  mean  using  a posteriori  labels,  i.e.  clustering,  and 
making  the  human  consumer  adapt  himself  to  the  product,  or 
using  a priori  labels,  but  only  for  a class  hierarchy  that  is 
known to be separable. Some major classes are easily distinguished,
e.g. bodies of water in the infrared band, but fine
subdivision  can  be  difficult,  such  as  distinguishing  among 
10  different  crops. 

Past research seems to indicate that successful thematic
mapping can be achieved with the 4 classes of urban, bare soil,
water, and vegetation (Gramenopoulos, 1973) and that up to 20
different  classes  may  be  practical  (Anderson,  1971). 

Note  that  20  different  classes  in  a thematic  map  could  yield 
many  more,  and  perhaps  an  acceptable  number  of,  symbolic  map 
classes.  This  is  because  several  classes  of  the  same  road 
theme  could  be  separated  by  size,  rivers  could  be  separated 
from  lakes,  forests  from  parks,  etc.  In  addition,  spatial 
compositions  of  the  primitive  themes  could  yield  unique  map 
symbology,  i.e.  a pattern  of  buildings,  roads,  and  trees  could 
yield  a region  symbolizing  a town.  Table  6.1  gives  a list  of 
thematic  classes  for  which  we  might  have  some  hope  of  successful 
separation  and  table  6.3  gives  a list  of  map  symbols  derivable 
from  them.  It  must  be  emphasized  that  tables  6.1  and  6.2 
represent  preliminary  suggestions  and  not  final  conclusions. 

With MSS data, primitive themes are used to classify resolution sized
pixels  only.  Only  the  signature  of  the  pixel  itself  or  the 
signatures  of  immediate  neighbors  will  be  used  to  make  the 
classification  decision.  Spatial  structure,  shape,  or  texture 
therefore  cannot  be  used  as  features  for  classifying  single 
pixels.  There  has  been  some  problem  with  this  in  the  past, 
largely  because  ERTS  resolution  elements  are  about  an  acre  in 
size and could easily contain several primitive land classification
themes as given in table 6.1. Ground resolution and map
symbology must be appropriately matched. For recognition of
roads a ground resolution of 10 m or better is necessary. At
this resolution "lake" or "forest" cannot be used as primitive
themes.  In  order  to  recognize  a forest  a large  extent  of  tree 
pixels  would  have  to  be  first  recognized.  Aggregation  of 
primitive  themes  into  regional  units  of  information  is  a major 
function  of  map  making.  Curves  of  asphalt  or  concrete  pixels 
need  to  be  collected  in  order  to  recognize  and  symbolize  a 
road.  When  the  map  is  printed  or  stored,  for  most  map  scales, 
trees  lining  the  mid  strip  of  a divided  highway  will  be 
suppressed  rather  than  symbolized.  Similarly,  a small  pond 
within  a forest,  even  if  several  pixels  in  extent,  might  be 
suppressed.
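
The aggregation of primitive theme pixels into regional units can be illustrated by a short Python sketch. It assumes a binary class slice stored as a list of rows, groups 4-connected pixels by flood fill, and keeps only groups of at least `min_area` pixels. The function name `regions` and the area threshold are hypothetical; a working system would have to choose thresholds to suit the map scale.

```python
def regions(class_slice, min_area=1):
    """Group 4-connected pixels of a binary class slice into regions.

    Returns a list of regions, each a sorted list of (row, col) pixels,
    keeping only regions with at least min_area pixels.
    """
    h, w = len(class_slice), len(class_slice[0])
    seen = [[False] * w for _ in range(h)]
    found = []
    for r in range(h):
        for c in range(w):
            if class_slice[r][c] and not seen[r][c]:
                stack, region = [(r, c)], []
                seen[r][c] = True
                while stack:
                    y, x = stack.pop()
                    region.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and class_slice[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if len(region) >= min_area:
                    found.append(sorted(region))
    return found


# A block of tree pixels and one isolated tree pixel.
tree = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 0],
]
forests = regions(tree, min_area=4)
print(len(forests), len(forests[0]))
```

With `min_area=4` the isolated tree pixel is suppressed and only the larger block survives as a candidate forest region.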

Table 6.2 gives some of the knowledge sources that are
useful in interpreting arrays of thematically classified pixels
for producing map symbology. Table 6.3 contains map symbology
that may be inferred from aggregation or composition of primitive
themes (table 6.1) using the knowledge sources (table 6.2).

Table  6.4  gives  an  example  of  primitive  themes  and  map  symbology  that  might 
be  useful  in  compilation  of  1:100,000  maps  of  Maryland. 

Roads  and  drains  are  gotten  by  1-D  aggregation  of  appropriate 
primitive  theme  pixels.  Lakes  and  forests  are  gotten  by  2-D 
aggregation of regions of uniform themes. Other map regions
must  be  gotten  by  extension  of  dissimilar  themes.  For  example 
an  urban  region  is  composed  of  appropriately  textured  or 
distributed  pixels  of  all  primitive  themes  while  a swamp  is  a 
region  textured  appropriately  with  water  and  vegetation  themes. 
Higher  level  processing  of  thematic  information  is  treated  in 
section 6.3.3.

Table 6.1  Thematic Classes

Primitive Themes               Possible Subthemes
-----------------------------  ----------------------
1. Water                       1.1 Cloud
                               1.2 Snow or ice
                               1.3 Liquid

2. Bare Rock or Soil           2.1 Sandstone
                               2.2 Salt
                               2.3 Granite
                               2.4 Limestone
                               2.5 Loam

3. Vegetation                  3.1 Tree
                                   3.1.1 Deciduous
                                   3.1.2 Conifer
                               3.2 Bush
                               3.3 Groundcover
                                   3.3.1 Grass
                                   3.3.2 Wheat

4. Man-Fabricated Material     4.1 Asphalt
                               4.2 Metal
                               4.3 Concrete
                               4.4 Wood

5. Elevation


Table 6.2  Knowledge Sources

1. Spatial
   1.1 neighborhood
   1.2 statistical co-occurrence
   1.3 texture

2. Structural
   2.1 connectivity
   2.2 shape
   2.3 continuity

3. Semantic
   3.1 drainage matches elevations
   3.2 streets tend to be perpendicular

4. Climatic
   4.1 time of year
   4.2 precipitation
   4.3 temperature

5. Positional
   5.1 previous maps
   5.2 specific regional information


Table 6.3  Map Symbology

Aggregations of Primitive Themes

  Water
    Cloud                     (T1.1)
    Snow or ice               (T1.2)*&(K4,K5)**
    Drainage                  (T1.3)&(K2,K3,K4,K5)
    Storage body              (T1.3)&(K2,K3,K4,K5)
    Intermittent drainage     (T1.3)&(K2,K3,K4,K5)
  Forest                      (T3.1)&(K2,K3,K5)
  Cropland                    (T3.3)&(K2,K3,K5)
  Elevation                   (T5)&(K2)
  Roads & RRs                 (T4.1,T4.3,T2)&(K2,K3,K5)
  Building                    (T4)&(K1,K2,K3,K5)

Compositions of Primitive Themes

  Urban                       (T1-5)&(K1-5)
    Residential
    Industrial
  Swamp                       (T1.3,T3)&(K1,K3,K5)
  Desert                      (T3.2,T2)&(K1,K3,K5)
  Savannah                    (T3)&(K1,K3,K5)
  Alpine                      (T3,T5)&(K4,K5)
  Beach                       (T1,T2)&(K3,K5)

*  T refers to "primitive theme" as in table 6.1
** K refers to "knowledge source" as in table 6.2


6.2  Available  sensors 

Any  emittance  or  reflectance  from  the  earth  that  is 
transmitted  faithfully  through  the  atmosphere  can  be  used  for 
remote  sensing  of  the  earth's  properties.  Electromagnetic 
radiation with wavelength range of 0.3 microns (μ) to 3 cm is
practical. This includes near ultraviolet, visible light,
infrared, radar and microwave. Generally resolving power
decreases with increasing wavelength but effectiveness in fog
or cloud cover increases. There is no best sensor for all
recognition tasks and all resolutions. ERTS-1 sampled four
bands of the spectrum from 0.5μ to 1.1μ with a coarse ground
resolution of about 70 m. (See Estes, 1974.) ERTS-2 has 2
added bands, one on each end of the set of bands sampled by
ERTS-1. Scanners and photographic equipment collecting sunlight
reflected from the earth or thermal radiation emitted are called
passive sensors. Active sensors include radar and lasers. Low
altitude active night photography can also be done. Unclassified
radar systems have a ground resolution capability of 10 m
(See Estes, 1974.), well within the range needed for useful
cartography. Resolution for the visible bands of the spectrum
ranges from 1 m to 1000 m depending on altitude and purpose of
collection.

Table  6.5  summarizes  the  capabilities  of  some  sensors  for 
use  in  classification  of  land  features.  In  some  cases  only  one 
band  is  necessary  for  good  detection.  Water,  for  instance,  is 
readily detected in the infrared band. Radar or visible color
information can augment the infrared information if needed.
Different kinds of information are gotten about vegetation
from the different bands (Estes, 1974). Variations in pigmentation
are gotten from the visible, structural differences in
spongy mesophyll are seen in the near infrared, and moisture
stress is picked up in the far infrared. More progress in
sensor technology is expected and research in classification
is continuing. Progress is needed in order to transfer classification
successes over temporal and locational changes of
data collection.

Table 6.4  Example for 1:100,000 Map Compilation in Maryland Region (Rg = 5 m)

Primitive Themes                Map Symbology
------------------------------  -----------------------
1. Cloud                        1. Perennial drainage
2. Snow or Ice                  2. Water bodies
3. Liquid Water                 3. Forest
4. Sand                         4. Swamp
5. Other Exposed Earth          5. Beach
6. Tree                         6. Elevation contours
7. Brush, Crop, Groundcover     7. Roads
8. Asphalt                      8. Railroads
9. Concrete                     9. Urban, small
                               10. Urban
                               11. Urban, extensive


Table 6.5  Sensors for Detection of Primitive Themes

Primitive Theme           Sensor
------------------------  --------------------------------
Water                     infrared, radar

Bare rock or soil         visible color (Rib, 1973)
                          night infrared
                          spectral ratios (Vincent, 1973)

Vegetation                visible color
                          infrared (Estes, 1974)

Man-fabricated material   visible color
                          infrared
                          radar

Elevation                 visible stereo pairs


6.3  Recognition of cartographic features using MSS data


The  chief  advantage  of  MSS  data  over  single  sensor  data  is  that  a large 
amount  of  information  is  available  in  tonal  image  features  at  each  point 
(x,y)  and  thus  ground  element  classifications  can  be  made  in  a logically 
and computationally simple manner. Class-slices, or overlays, gotten by
low level pixel classification were discussed in Section 4.2. The result
is that binary images (x, y, b_i(x,y)) can easily be created for any primitive
cartographic feature c_i. The set of all such class slices thus created
would  define  a crude  thematic  map. 
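
As a small illustration, the following Python sketch builds one binary image per class from an array of pixel labels. It assumes the low level classifier has produced exactly one label per pixel, so the slices it produces are disjoint; the label names and the function name `class_slices` are hypothetical.

```python
def class_slices(labels, classes):
    """Return one binary image per class: 1 where the pixel carries that label."""
    return {c: [[1 if v == c else 0 for v in row] for row in labels]
            for c in classes}


labels = [
    ["water", "water", "tree"],
    ["road", "tree", "tree"],
]
slices = class_slices(labels, ["water", "tree", "road"])
for name, image in slices.items():
    print(name, image)
```

Keeping the slices as separate images is a design choice that lets each one be smoothed or otherwise processed independently before they are combined again.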

There  are  at  least  4 steps  of  refinement  necessary  in  creating  an  acceptable 
cartographic  product  from  a set  of  primitive  class  slices.  First,  very  general 
"smoothing"  operations,  such  as  hole-filling  and  noise  suppression  should  be 
performed on all class-slices independently. Secondly, feature specific
processing algorithms should be applied to individual class slices. This could
include  comparison  of  the  data  with  lineal  archive  data.  Third,  combinations 
of  class  slices  should  be  considered  simultaneously  for  creation  of  composite 
features  from  primitive  ones  and  for  adjustment  of  one  feature  due  to  the 
presence  of  another.  Fourth,  class  slice  data  should  undergo  a representation 
transformation  from  array  form  to  lineal  form.  This  step  might  at  first  appear 
to  be  unnecessary  if  raster  output  is  to  be  done.  However,  symbolization  is 
inherent in the conversion from array to lineal representation. Lines must
be thinned to centerlines and region boundaries must be tracked. As part of
the symbolization process, lines on the map will be uniform in width and
will  be  wider  than  life  on  small  scale  maps.  These  four  steps  in  processing 
MSS  data  for  cartographic  feature  extraction  are  discussed  in  more  detail  in 
the following sections. An overall view of the processing is given in Figure 6.1.

Figure 6.1  Possible steps for production of topographic maps
from MSS data.

6.3.1  Enhancing thematic maps

Maps  for  human  consumption  require  resolution  which 
varies  according  to  the  features  symbolized.  For  example,  a 
forest  feature  need  not  be  displayed  with  the  same  accuracy 
necessary  for  a road  or  building.  However,  a practical  MSS 
scanning  system  will  produce  resolution  elements  of  the  same 
size  for  all  features.  In  order  to  correct  errors  and  suppress 
detail  thematic  maps  need  enhancements. 

For  logical  simplicity  it  is  assumed  at  this  point  that  a 
separate  (binary)  image  is  available  for  each  primitive  theme 
classified. This will be called a class slice. By allowing
several class slices we allow a given map resolution element
to be associated with multiple themes. This not only simplifies
processing but also aids in decision making and in fact corresponds
to overlay formation in current automatic cartographic
techniques.

Figure  6.2  shows  how  gaps  and  holes  can  be  filled  by  a 
single  simple  process  acting  on  each  individual  overlay.  The 
particular  example  shows  how  continuous  road  features  and  solid 
vegetation  regions  can  be  assembled  using  a growing  operation 
followed  by  a shrinking  operation.  Because  the  scanner  partitions 
its  window  into  ground  resolution  elements  and  because  the 
classification  logic  is  unlikely  to  identify  two  themes  for 
the  same  pixel,  road  pixels  will  form  holes  in  forest  regions 
and forest canopy pixels will cause gaps in road themes. By
growing  the  road  class  slice,  segments  of  the  road  will  fuse 
over  the  gaps  and  by  growing  the  vegetation  region  the  holes 
caused  by  the  road  will  be  filled.  Shrinkage  of  the  same  amount 
would  then  return  the  vegetation  area  to  its  original  form 
without  the  holes  and  would  produce  a continuous  road  of  the 
same  width  as  before  but  without  gaps.  The  amount  of  hole  and 
gap  filling  performed  is  controlled  by  a growth  shrinkage 
parameter.  Results  are  not  likely  to  be  perfect  but  should 
definitely  enhance  the  original  class  slices. 
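
The grow-then-shrink operation described above can be sketched as follows. This is only an illustration under stated assumptions: a class slice is represented as a set of (row, column) pixels, growth uses a square 8-neighborhood, and the names `grow`, `shrink`, and `fill_gaps` are hypothetical, not taken from the report.

```python
# All 9 offsets of a 3x3 neighborhood (assumed structuring element).
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def grow(pixels, steps=1):
    """Grow a class slice: add every pixel that touches the slice."""
    for _ in range(steps):
        pixels = {(r + dr, c + dc) for r, c in pixels for dr, dc in OFFSETS}
    return pixels


def shrink(pixels, steps=1):
    """Shrink a class slice: keep only pixels whose whole neighborhood is set."""
    for _ in range(steps):
        pixels = {(r, c) for r, c in pixels
                  if all((r + dr, c + dc) in pixels for dr, dc in OFFSETS)}
    return pixels


def fill_gaps(pixels, amount):
    """Grow then shrink by the same amount (the growth/shrinkage parameter)."""
    return shrink(grow(pixels, amount), amount)


# A one-pixel-wide road with a one-pixel gap at column 3 (e.g. under a tree).
road = {(0, c) for c in range(7) if c != 3}
repaired = fill_gaps(road, 1)
```

With an amount of 1 the gap fuses while the road keeps its original one-pixel width; wider gaps would need a larger amount.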

Suppression  of  isolated  detail  can  be  done  by  first 
shrinking  and  then  growing  as  shown  in  figure  6.3.  Isolated 
regions  of  a class  slice  which  are  smaller  than  the  shrinkage 
parameter  will  disappear  and  thus  not  be  restored  by  the 
subsequent  growing  operation.  On  the  other  hand,  regions  of 
a class  slice  of  diameter  larger  than  the  shrinkage  parameter 
will  be  restored  to  a "smoothed"  version  of  the  original.  Note 
that  the  narrow  inlet  I in  figure  6.3  will  shrink  away  and  not 
be  restored.  Class  slices  of  gross  areal  features,  such  as 
water  bodies  and  forests,  will  be  enhanced  by  such  processing, 
but  class  slices  of  lineal  features  such  as  road  networks  should 
not  be  processed  in  this  way. 
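
Suppression of isolated detail reverses the order of the two operations. The sketch below makes the same assumptions as before (pixel-set class slices, square 8-neighborhood); the function names and the sample lake are hypothetical.

```python
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def grow(pixels, steps=1):
    """Add every pixel that touches the slice, `steps` times."""
    for _ in range(steps):
        pixels = {(r + dr, c + dc) for r, c in pixels for dr, dc in OFFSETS}
    return pixels


def shrink(pixels, steps=1):
    """Keep only pixels whose whole 3x3 neighborhood is set, `steps` times."""
    for _ in range(steps):
        pixels = {(r, c) for r, c in pixels
                  if all((r + dr, c + dc) in pixels for dr, dc in OFFSETS)}
    return pixels


def suppress_detail(pixels, amount):
    """Shrink then grow: regions smaller than the parameter vanish for good."""
    return grow(shrink(pixels, amount), amount)


lake = {(r, c) for r in range(5) for c in range(5)}   # 5x5 areal feature
speck = {(10, 10)}                                    # isolated misclassified pixel
cleaned = suppress_detail(lake | speck, 1)
```

A one-pixel-wide road passed through `suppress_detail` would disappear entirely, which is why lineal class slices should not be processed this way.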

As a by-product of growing followed by shrinking, areas
"peppered" with pixels of one class will fuse if the individual
pixels are within a certain distance of each other. Because
of this, swamp areas which are textured with water and
vegetation elements will be indicated by an area of overlap in
the vegetation and water overlays. The swamp is termed a
composite theme and can be detected, as shown in
figure 6.4, by intersecting the enhanced water and vegetation
class  slices.  Other  composites  are  possible;  for  example,  a 
residential  area  could  be  defined  by  the  intersection  of 
asphalt  and  vegetation  class  slices. 
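
A composite theme can be sketched as the intersection of two enhanced class slices. The example assumes pixel-set class slices and a square 8-neighborhood; the checkerboard "swamp" data and all function names are hypothetical.

```python
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def grow(pixels, steps=1):
    for _ in range(steps):
        pixels = {(r + dr, c + dc) for r, c in pixels for dr, dc in OFFSETS}
    return pixels


def shrink(pixels, steps=1):
    for _ in range(steps):
        pixels = {(r, c) for r, c in pixels
                  if all((r + dr, c + dc) in pixels for dr, dc in OFFSETS)}
    return pixels


def enhance(pixels, amount):
    """Grow then shrink, so that peppered pixels fuse into a region."""
    return shrink(grow(pixels, amount), amount)


def composite_theme(slice_a, slice_b, amount=1):
    """Pixels covered by both enhanced class slices."""
    return enhance(slice_a, amount) & enhance(slice_b, amount)


# Water and vegetation pixels interleaved in a 4x4 area, plus a clean lake.
water = {(r, c) for r in range(4) for c in range(4) if (r + c) % 2 == 0}
vegetation = {(r, c) for r in range(4) for c in range(4) if (r + c) % 2 == 1}
lake = {(r, c) for r in range(10, 13) for c in range(10, 13)}
swamp = composite_theme(water | lake, vegetation)
```

The raw slices never share a pixel, yet the enhanced slices overlap in the textured area; the clean lake contributes nothing to the composite.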

6.3.2  Extraction  of  hydrologic  features 

Large water bodies can be obtained by shrinking and then
growing the water class slice. The shrinking must be
enough  to  suppress  all  small  drainage  features.  A drainage 
class  slice  can  then  be  created  by  logically  subtracting  the 
water  body  class  slice  from  the  water  class  slice.  The  drainage 
class  slice  could  be  partitioned  into  slices  for  large  and 
small  features  by  the  same  technique  if  required. 
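
The separation of water bodies from drainage might look like the sketch below, again assuming pixel-set class slices and a square 8-neighborhood. The function names and the lake-plus-stream test data are hypothetical.

```python
OFFSETS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def grow(pixels, steps=1):
    for _ in range(steps):
        pixels = {(r + dr, c + dc) for r, c in pixels for dr, dc in OFFSETS}
    return pixels


def shrink(pixels, steps=1):
    for _ in range(steps):
        pixels = {(r, c) for r, c in pixels
                  if all((r + dr, c + dc) in pixels for dr, dc in OFFSETS)}
    return pixels


def split_water(water, amount):
    """Return (water bodies, drainage) from a water class slice."""
    bodies = grow(shrink(water, amount), amount)   # thin features do not survive
    drainage = water - bodies                      # logical subtraction
    return bodies, drainage


lake = {(r, c) for r in range(5) for c in range(5)}
stream = {(2, c) for c in range(5, 10)}            # one pixel wide, leaving the lake
bodies, drainage = split_water(lake | stream, 1)
```

Applying `split_water` to the drainage slice with a smaller amount would partition it further into large and small features, as the text suggests.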

For  compact  storage  or  display  by  random  point  plotting, 
the  boundaries  of  connected  regions  of  the  water  body  class 
slice  could  be  tracked  and  chain-encoded  (Freeman,  1961). 
Line-tracking  algorithms  could  form  a chain-encoded  drainage 
network  from  the  drainage  class  slice,  possibly  after  a thinning 
algorithm  is  first  applied  to  the  binary  image.  The  tracking 
algorithm  could  use  heuristics  to  fill  network  gaps  or  could 
consult  a map  data  base  when  making  extensions  of  lines  beyond 
the  MSS  evidence  recorded  in  the  class  slice.  For  instance,  a 
stream that is lost in foliage might be tracked by registering
with a symbolic map created from winter imagery or by globally
luring  the  drainage  class  slice  and  elevation  matrix. 
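
Chain encoding of a tracked line can be illustrated as follows. The sketch assumes the tracker has already produced an ordered, 8-connected list of (row, column) pixels, and it assumes the common numbering of the eight directions (0 = east, counter-clockwise) with rows increasing downward; the function names are hypothetical.

```python
# Direction codes 0..7 as (d_row, d_col): E, NE, N, NW, W, SW, S, SE.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]


def chain_encode(path):
    """Encode an ordered 8-connected pixel path as (start pixel, code list).

    Raises ValueError if two successive pixels are not 8-neighbors.
    """
    codes = [DIRECTIONS.index((r1 - r0, c1 - c0))
             for (r0, c0), (r1, c1) in zip(path, path[1:])]
    return path[0], codes


def chain_decode(start, codes):
    """Rebuild the pixel path from a start pixel and its chain code."""
    path = [start]
    for code in codes:
        dr, dc = DIRECTIONS[code]
        r, c = path[-1]
        path.append((r + dr, c + dc))
    return path


stream_path = [(5, 0), (5, 1), (4, 2), (3, 2), (3, 1)]
start, codes = chain_encode(stream_path)
```

Each step costs three bits instead of a full coordinate pair, which is the source of the compact storage mentioned above.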

6.3.3 Roads and urban areas


Roads  and  urban  areas  can  be  handled  in  the  same 
manner  as  drainage  and  water  bodies.  If  the  imagery  is  of 
high  enough  quality,  separation  according  to  road  size  may  be 
possible  (Radosevic,  1976).  After  or  during  tracking,  the 
road  network  would  need  to  be  cleaned  up  by  assuming  continuity 
of  single  roads,  intersections  of  nearby  roads,  and  agreement 
with  past  mapping.  Urban  areas  can  be  encoded  by  tracking  the 
boundary  of  the  urban  class  slice. 

6.3.4  First  level  symbolic  map 

By  tracking  line  networks  and  region  boundaries 
the  thematic  map  information  becomes  more  highly  structured-- 
individual  pixels,  previously  processed  locally  in  parallel, 
are  now  highly  related.  The  regions  and  networks  thus  formed 
are  the  basis  for  symbolic  map  production.  Without  further 
cleanup  these  features  could  be  plotted  to  yield  a first  level 
symbolic  map.  Further  map  revision  is  discussed  in  sections  6.5 
and  6.6  below. 

6.4 Recognition of cartographic features using black and white imagery

Even  with  an  array  of  tonal  samples  for  each  pixel  it  will  be  some  time 
before  processing  steps  as  discussed  in  Section  6.3  could  be  made  viable.  In 
this  section  the  possibility  of  doing  automatic  analysis  with  even  less  input 
data  is  examined.  Cartographers  and  photo  interpreters  have  been  working  with 
black and white aerial photography for decades. Such imagery is readily
available and shows fine detail. However, in order to interpret B&W imagery, image
features of a global scale must be used at primitive stages of decision-making.
Humans readily use the Gestalt, or global character, of a scene to make
unambiguous local interpretations. Reproducing such a capability in a machine will
not  be  easy. 

As an example of the difficulty of automatic analysis of B&W imagery,
consider the case of interpreting a dark area in an image taken near the
Washington Monument. The tone of the pixels could indicate either water or asphalt,
leading to possible interpretations of either pond or parking lot for the entire
dark area. Since both of these features could also have similar textures, it
is impossible to differentiate them locally from image pixel features. Even the shape
of the feature is insufficient for discrimination: the reflecting pool is
rectangular, as a parking lot would appear to be. In many cases the context of
neighboring features would be useful, but in this case the many concrete paths
leading to the reflecting pool would support the parking lot interpretation more
than the pond interpretation.

Although not all cases are as subtle as the case described above, there
are many severe problems in using B&W imagery in automatic analysis.
Distinguishing between roads and streams presents another difficult situation: tone,
size, shape, and topological features may all be similar. The large amount of
real-world knowledge possessed by trained photo interpreters allows them to
rapidly arrive at globally correct interpretations even in areas for which they
have no reference map. There appear to be two viable approaches to the automatic
interpretation of B&W imagery. The first approach would use general contextual
and relational knowledge to arrive at a unique consistent labeling of a scene
given an ambiguous set of possible labels for each scene region or line feature.
This approach is studied in Section 6.4.1. A far different approach to the
problem is to do all analysis with respect to a base map of features in the area
being imaged. Map-directed image analysis is treated in Section 6.4.2.


6.4.1  Arriving  at  global  interpretation  by  propagation  of  local  constraints 

In this section the technique of relaxation, or relaxation labeling, is
considered for use in interpreting the content of imagery. Currently the rage
in image processing, relaxation was recently invented by Waltz [1975] and
further developed by many others. The treatment rendered here owes much to
Tenenbaum and Barrow [1976], who have applied relaxation labeling to image
analysis problems with goals similar to those of this study.

Relaxation  is  basically  a bottom-up  process  which  filters  through  multiple 
possibilities  allowed  by  the  extraction  of  local  information.  Possibilities 
are  thrown  out  rather  than  brought  up  as  the  propagation  process  drives  toward 
a global  interpretation  composed  of  locally  consistent  parts.  To  start  the 
procedure  it  is  assumed  that  preprocessing  has  been  used  to  extract  lineal  and 
region  type  objects  from  an  image.  The  objective  of  the  procedure  is  to  discover 
the  correct  feature  or  label  to  assign  to  each  object  in  the  segmented  image.  For 
instance,  each  lineal  object  should  be  identified  as  a stream,  road,  or  railroad 
while  each  region  should  be  identified  as  urban,  open  water,  forest,  etc.  Real 
world knowledge will be applied to identification of single objects by using
features such as size and shape and to identification of sets of objects by considering
spatial  and  topological  relationships.  Encoding  real  world  constraints  for  use  by 
a uniform  procedure  is  one  of  the  chief  problems  and  is  considered  below.  Getting 
the  good  segmentation  of  an  image  that  we  have  assumed  available  is  also  a difficult 
problem  and  will  not  be  further  discussed  here.  (Zucker  [197  and  others  have  used 
relaxation  at  a lower  level  to  extract  the  objects  themselves.) 

A simple example of image interpretation is developed in Figure 6.5a-f.
Possible labels (cartographic features) for region and lineal objects are
specified to the process. These labels will be multiply assigned to each
image object according to primitive measurements made during extraction of
the objects. For instance, a long thin curve may be labeled {R,RR,S}, meaning
that from information gathered so far this object may be either a road,
railroad, or stream. If it is later found that this object connects to an object
known to be a body of open water, then the labels {R,RR} are discarded and the
long thin curve is known to be a stream {S}. This interpretation might then
be further propagated to refine the interpretation of other nearby or connecting
objects. Six sample cartographic features are given in Figure 6.5a and four
sample topological relationships that might exist between such features are
defined in Figure 6.5b. Figure 6.5c gives sample real-world constraints on the
cartographic features. For example, streams can connect to other streams,
$\exists\, s_1, s_2 \ni c(s_1, s_2)$, but streams cannot penetrate other streams,
$\nexists\, s_1, s_2 \ni p(s_1, s_2)$.
Similarly the encoded constraints also state that roads do not connect to open
water and that urban regions do not appear inside of other urban regions.

Figure 6.5d shows a sample image segmented into 4 lineal objects and
4 regions. The relationships known from the extraction process are given in
Figure 6.5e. All that is known about the objects are the observed relationships
and measurements, and that they must as a set satisfy the a priori constraints
given in Figure 6.5c. We suppose that in this case enough information exists
to know that region R1 is water W. The initial state of labels on the eight
image objects is given in Column 2 of Figure 6.5f. Seven of the eight objects
present maximum ambiguity. During relaxation, the facts of 6.5e are passed
against the constraints of 6.5c to refine the sets of possible labels. For
example, since water cannot be penetrated, i.e. $\nexists\, X \ni p(X, W)$, and R3 and R4
are penetrated by L1, then R3 and R4 cannot be water. Since L3 cannot be a
road R or railroad RR, it must be a stream S. Further propagation results
in the final column of Figure 6.5f. A few ambiguities persist and cannot be
removed without more knowledge. The paper by Tenenbaum [1976] should be
consulted for more details.
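
The filtering step can be sketched as a small discrete relaxation loop. Everything specific here is an assumption made for illustration: the label symbols U (urban) and F (forest), the encoding of constraints as forbidden (relation, label, label) triples, and the three observed facts, in particular c(L3, R1), which is a hypothetical reason for L3 losing its road and railroad labels. The actual contents of Figures 6.5c and 6.5e are not reproduced.

```python
LINEAL = {"R", "RR", "S"}          # road, railroad, stream
REGION = {"W", "U", "F"}           # open water; U and F are assumed symbols

# Label pairs that the a priori constraints rule out for each relation.
FORBIDDEN = {
    ("c", "R", "W"), ("c", "RR", "W"),                   # no road/rail into water
    ("p", "S", "S"),                                     # streams do not cross
    ("p", "R", "W"), ("p", "RR", "W"), ("p", "S", "W"),  # water is not penetrated
    ("i", "U", "U"),                                     # no urban inside urban
}


def relax(labels, facts):
    """Discard labels that no label of the related object can support."""
    labels = {obj: set(ls) for obj, ls in labels.items()}
    changed = True
    while changed:
        changed = False
        for rel, x, y in facts:
            keep_x = {lx for lx in labels[x]
                      if any((rel, lx, ly) not in FORBIDDEN for ly in labels[y])}
            keep_y = {ly for ly in labels[y]
                      if any((rel, lx, ly) not in FORBIDDEN for lx in labels[x])}
            if keep_x != labels[x] or keep_y != labels[y]:
                labels[x], labels[y] = keep_x, keep_y
                changed = True
    return labels


initial = {"L1": LINEAL, "L3": LINEAL, "R1": {"W"}, "R3": REGION, "R4": REGION}
facts = [("p", "L1", "R3"), ("p", "L1", "R4"), ("c", "L3", "R1")]
final = relax(initial, facts)
```

Note that the loop only ever removes labels, which is the "passive knowledge" limitation raised in the discussion that follows.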

While procedures such as that described above are promising and are the
subject of much current research, there are difficult problems to be faced
before practical solutions can result. The foremost problem is that the
relational constraints are passive knowledge, which serves to destroy but not
create. Active knowledge is probably necessary to achieve a satisfactory
segmentation on which to operate. Beyond this, there is the problem that
knowledge encoded as relational constraints is somewhat fuzzy. Roads often do
terminate at streams or open water, so these features might well appear to be
connected in imagery. Incorrect decisions at beginning levels of propagation
could produce meaningless interpretations.

Region objects: Open water W
Lineal objects: Road R, Railroad RR

Figure 6.5a) Possible interpretations for extracted image objects

c(X,Y)  X connects to Y, where X is a lineal object and Y is either
        a lineal or region object
p(X,Y)  X penetrates Y, where X is a lineal object and Y is either
        a lineal or region object
i(X,Y)  region object X is inside of region object Y
a(X,Y)  region object X is adjacent to region object Y

Figure 6.5b) Set of relations observable for certain pairs of uninterpreted
image objects.

Figure 6.5e) Relationships between objects of segmented imagery

Figure 6.5f) Interactive removal of ambiguities in object interpretation by filtering
(passing Figure 6.5e against 6.5c)

6.4.2 Map-directed analysis of B&W imagery


There are two severe problems encountered in the analysis of B&W imagery.
The first problem, needing global consistency in order to make local
interpretations, was discussed in the introduction to this section. The second
problem is the difficulty of extracting meaningful curves and regions from the
imagery to use as cartographic features. Noise and lack of contrast usually
thwart bottom-up feature extraction procedures, and fragmented and disconnected
features typically result. Top-down analysis done under the direction of an
existing base map could provide a solution to both problems.

Knowledge  stored  in  a base  map  is  not  only  highly  specific  to  the  geometry 
and  content  of  the  area  being  imaged  but  it  is  also  easy  to  use  in  computer 
programs  because  the  knowledge  is  locational  in  nature  and  easily  registered 
to  image  positions.  It  would  thus  be  easy  to  locate  image  pixels  which  are  in 
the  middle  of  a specific  lake  or  road.  Focused  searches  could  be  done  for  any 
lineals  in  the  map  archive  which  should  be  observed  in  the  image.  In  this  manner, 
interpretations  could  be  established  for  a large  number  of  feature  fragments  of 
the image. Then feature specific routines could be used to operate on
uninterpreted data using the interpreted fragments as a guide. Roads and streams must
be  tracked  to  form  connected  networks,  for  instance.  Parking  lots  could  be  scanned 
to  count  cars,  known  crop  lands  could  be  checked  for  plowing,  etc.  In  this  manner, 
all  image  features  would  be  interpreted  against  cartographic  features  recorded  in 
the  data  base. 


The system described above would basically be a change detection system.
Changes in the shape or location of critical features should alert reporting
processes of the system. Critical regions of the imagery should be scanned
for the appearance of features not mapped. These new features may be recent
changes on the earth or may result from an image scale that is larger than
the map scale. Much experimentation is necessary to test the concepts
discussed. Future items for research are given in Section 9.

6.5 Feature specific map adjustment

Many adjustments to the map data may be necessary
before  printing  the  map.  Some  of  these  will  depend  on  the 
technique  for  producing  the  map  and  should  not  alter  stored 
data;  other  techniques  will  alter  the  stored  data.  For  example, 
in  the  case  of  a road  penetrating  a vegetation  window  it  may 
be  necessary  to  subtract  the  road  symbolization  from  the  window 
so  that  two  colors  of  the  map  will  not  overprint.  It  may  not 
be  necessary  to  keep  the  two  parts  of  vegetation  in  computer 
store,  however.  In  general  any  lineal  feature  (road,  drain, 
etc.)  must  be  checked  for  overlap  with  any  areal  feature 
(forest,  lake,  etc.).  Similarly  any  two  areal  features  and  any 
two lineal features need to be checked for overlap. Two
overlapping areal features, e.g. urban and lake, may imply the need
for cleanup, or at the lower level may require the creation of
a composite class slice (as in the case where overlap of large water and
vegetation areas indicates swamp). Cartographic standards
require that contour lines show cut and fill at road intersections
and show the gradient of drainage. Contour lines are themselves
symbolic and are not apparent in imagery, nor are the
adjustments mentioned above. The symbolization of the road/contour
and  drain/contour  intersections  can  possibly  be  done  when 
tracking  in  the  elevation  matrix.  It  should  be  noted  in  passing 
that  contour  features  usually  dominate  in  the  storage  and 
plotting  considerations  of  automated  map  making  and  that 
adequate cartographic products could be produced more economically
by bypassing contour adjustment. Finally we must consider the
printing  of  place  names,  road  numbers,  point  symbols,  etc.,  on 
the  map  and  the  possible  suppression  of  other  information.  On 
a typical  topographic  map  most  of  these  symbols  will  be  printed 
on  top  of  other  features  with  little  clutter.  Road  numbers, 
however,  are  not  often  superimposed  on  the  road  feature  but  are 
"cut  out"  of  it.  Once  again  it  should  be  noted  that  faster, 
more  economical  map  production  is  possible  by  ignoring  the 
adjustments  of  name  and  point  symbol  placement. 


6.6  Addition  of  information  not  apparent  in  imagery 

Aerial  imagery  cannot  supply  all  information  to  be 
mapped.  Political  boundaries,  pipe  lines,  air  routes,  etc., 
may  not  be  visible  in  imagery  and  may  have  to  be  entered  by 
other  means.  Such  data  can  be  entered  as  polygonal  or  Freeman 
encoded  information  by  a human  operator  tracing  over  the 
feature  or  giving  successive  points  along  it.  Point  data  from 
any  source  can  be  entered  into  the  system  by  an  operator  working 
in  either  the  output  map  coordinates  or  in  some  standard 
coordinate system. Coordinates can be pre-assigned, for example
by field survey techniques, or can be generated automatically
by computer when an operator interacts with a display of the
area.

Names of places and features must be added to a map as
indicated above. Each feature identified in the map archive
can be assigned a symbolic name, e.g. "Dismal swamp", "Buffalo
crossroads", "Greenwood district", etc. Once assigned to
features  of  a symbolic  map,  names  can  be  transplanted  to  new 
imagery  by  correlating  features  of  the  new  imagery  with  the 
archived  features.  Certain  features  can  be  tagged  as  prominent 
features  in  the  archive  and  these  can  be  used  to  establish  a 
coordinate  transformation  between  a new  image  and  an  archived 
map  (Van  Wie,  1977,  and  Barrow,  1977).  Once  a coordinate 
transformation  has  been  established  then  all  other  less 
prominent  points  can  be  directly  mapped  from  the  archived  map 
to  the  image.  Identification  of  points  in  this  manner  leaves 
the  problem  of  printing  names  on  the  map  so  that  they  do  not 
overlap  one  another. 
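
Transplanting names through a coordinate transformation can be sketched with complex arithmetic. The sketch assumes that the new image differs from the archived map only by rotation, uniform scale, and shift, so that two matched prominent features determine the transformation; a real system would likely use more points and a least-squares fit. The function names and all coordinates are hypothetical.

```python
def similarity_from_two_points(map_pts, image_pts):
    """Solve image = a * map + b (complex a, b) from two matched features."""
    z1, z2 = (complex(*p) for p in map_pts)
    w1, w2 = (complex(*p) for p in image_pts)
    a = (w2 - w1) / (z2 - z1)      # rotation and scale
    b = w1 - a * z1                # shift
    return a, b


def map_to_image(point, a, b):
    """Carry an archived map position into image coordinates."""
    w = a * complex(*point) + b
    return (w.real, w.imag)


# Two prominent features located in both the archived map and the new image.
a, b = similarity_from_two_points([(0, 0), (1, 0)], [(10, 5), (10, 7)])

# Less prominent named features are then mapped directly.
archive_names = {"Dismal swamp": (0, 1), "Buffalo crossroads": (3, 2)}
image_names = {name: map_to_image(pos, a, b) for name, pos in archive_names.items()}
```

Here the image is rotated by 90 degrees and scaled by 2 relative to the map, so "Dismal swamp" at map position (0, 1) lands at image position (8, 5).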

7. ACES process control


Available knowledge for cartographic feature extraction is discussed
in Section 5 while the extraction of the features themselves is covered in
Section 6. In this section details of the process controlling feature
extraction and knowledge application are examined. The purpose here is to
consider a particular example rather than to study several possible alternatives.

A general assessment of some alternative mechanisms in knowledge engineering
is given in [Barnett 1977]. Since the image interpretation tasks at hand are
varied and complex, it is likely that no single mechanism offered by A.I. will
suffice but rather a combination of several will be necessary.

Figure 7.1 presents knowledge-based image interpretation as a simplified
sequence of 5 steps. At each of the 5 steps knowledge of some form is applied
in refining the analysis. If two knowledge sources do not interact in any
decision step, then it is possible to implement the knowledge sources in
different manners. The results of analysis must be given in the same image
description terms so that there is proper communication between processing steps. The
general operations shown in Figure 7.1 should behave as specified in Table 5.1.
The control of the individual processing steps themselves can be very complex
depending upon the type of knowledge engineering done. Description of this
complexity is avoided at this point in the research.

7.1 Preprocessing/general low-level feature extraction


The  first  processing  step  is  to  extract  curved  edges  which  represent 
fragments  of  lineal  features  or  region  boundaries.  The  goal  of  this  step  is 
to  provide  enough  evidence  for  registering  the  image  to  the  vast  resource  of 
positional  knowledge  stored  in  the  GDB.  Only  very  general  spatial/connectivity 
knowledge  is  used  in  this  processing  step,  making  it  very  fast.  The  curve 
fragments  are  examined  for  distinguishing  properties  such  as  having  a segment 
of very high curvature or of zero curvature (straight). The curve fragments are
described as a string of points together with their special properties and are
adjoined  to  the  raw  image  data. 

7.2 Registration of image data to the GDB

In  order  to  make  maximum  use  of  GDB  knowledge,  registration  is  attempted 
as  early  as  possible  in  the  ACES  process.  This  is  done  by  "correlating"  the 
curve  fragments  extracted  in  step  1 with  distinguishing  curve  segments  of  the 
GDB. This can be done by trying all pairwise matches between image curve
segments and GDB curve segments until a maximum number of matches can be explained
by  the  same  registration  transformation.  An  implementation  of  this  matching  via 
clustering  has  been  shown  to  be  feasible  [Stockman  1978].  Curve  segments  which 
are  not  matched  under  the  registration  procedure  must  be  processed  later  in 
subsequent  interpretation  steps. 
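
The idea of explaining a maximum number of pairwise matches by one transformation can be sketched as a voting (clustering) procedure. This is a simplified illustration, not the cited implementation: each distinguishing curve segment is reduced to a hypothetical (x, y, orientation) triple, only rotation and shift are considered, and the bin sizes are arbitrary assumptions.

```python
import math
from collections import Counter


def register(image_feats, map_feats, shift_bin=2.0, angle_bin=math.radians(5)):
    """Cluster the transformations implied by all pairwise feature matches.

    Each feature is (x, y, orientation in radians). Returns the rotation and
    shift (map = rotate(image) + shift) of the fullest bin and its vote count.
    """
    n_angle_bins = round(2 * math.pi / angle_bin)
    votes = Counter()
    for xi, yi, ti in image_feats:
        for xm, ym, tm in map_feats:
            rot = (tm - ti) % (2 * math.pi)    # rotation implied by this pairing
            dx = xm - (xi * math.cos(rot) - yi * math.sin(rot))
            dy = ym - (xi * math.sin(rot) + yi * math.cos(rot))
            key = (round(rot / angle_bin) % n_angle_bins,
                   round(dx / shift_bin), round(dy / shift_bin))
            votes[key] += 1
    (a_bin, x_bin, y_bin), count = votes.most_common(1)[0]
    return a_bin * angle_bin, x_bin * shift_bin, y_bin * shift_bin, count


# Hypothetical GDB features, and the same features seen in an image that is
# rotated by 90 degrees and shifted by (10, -4) relative to the map.
map_feats = [(10.0, 0.0, 0.0), (0.0, 20.0, 1.0), (-5.0, -5.0, 2.5), (30.0, 10.0, 0.4)]
image_feats = [(ym + 4.0, -(xm - 10.0), tm - math.pi / 2) for xm, ym, tm in map_feats]
image_feats.append((100.0, 100.0, 3.0))    # spurious fragment with no map counterpart

rotation, shift_x, shift_y, matches = register(image_feats, map_feats)
```

Correct pairings all vote for the same bin, while incorrect pairings scatter; the spurious fragment is left unmatched for the later interpretation steps.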

7.3 Curve segment labeling and object verification


Any  curve  segment  that  matches  a curve  segment  of  the  GDB  is  subject  to 
immediate  interpretation  and  labeling.  Because  of  its  known  position  through 
registration,  the  curve  segment  is  known  to  be  a portion  of  a stream  feature, 
road  feature,  land-water  boundary,  etc.  Other  attributes  of  the  feature  become 
known  through  association;  for  instance,  the  name  of  a road,  its  width  and 
material  composition.  Objects  which  were  completely  extracted  during  step  1 
are now completely interpreted, while incomplete objects must be further
processed in future steps.

7.4 Extension of network features

Networks  such  as  roads  and  stream  networks  can  be  extracted  under  GDB 
guidance  by  starting  with  appropriate  curve  segments  from  the  image  and  extending 
the  curves  through  image  points  with  fainter  image  features.  An  algorithm  is 
needed  which  will  test  the  conformity  between  mapped  features  and  observed  pixel 
properties and will implement topological, geometrical, and feature specific
knowledge. Mapped features drawing no support from image data should alert a human
interpreter  for  change  analysis  and  possible  revision  of  the  GDB. 
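
One very simple form of the conformity test is to measure how much of a registered map feature is backed by observed edge pixels. The sketch below assumes that features are lists of (row, column) pixels already in image coordinates; the tolerance, the support threshold, the function names, and the road names are all hypothetical.

```python
def support(feature_pixels, edge_pixels, tolerance=1):
    """Fraction of mapped pixels with an observed edge pixel within `tolerance`."""
    def has_evidence(r, c):
        return any((r + dr, c + dc) in edge_pixels
                   for dr in range(-tolerance, tolerance + 1)
                   for dc in range(-tolerance, tolerance + 1))
    hits = sum(has_evidence(r, c) for r, c in feature_pixels)
    return hits / len(feature_pixels)


def features_needing_review(gdb_features, edge_pixels, min_support=0.2):
    """Names of mapped features that draw too little support from the image."""
    return [name for name, pixels in gdb_features.items()
            if support(pixels, edge_pixels) < min_support]


gdb_features = {
    "Route 1": [(0, c) for c in range(10)],
    "Old mill road": [(20, c) for c in range(10)],
}
# Fragmented edge evidence one row away from Route 1; nothing near the other road.
edge_pixels = {(1, c) for c in range(0, 10, 2)}
flagged = features_needing_review(gdb_features, edge_pixels)
```

Features returned by `features_needing_review` would be passed to the human interpreter for change analysis rather than being deleted automatically.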

7.5 Final change detection


After step 4 most of the edge and curve activity of the new imagery will
have been interpreted relative to the GDB. Further clean-up and search is
required. First of all there will be curve segments which are present in the
image but not the map. These may be due to noise or artifact or may indicate
significant objects which are new or which were not mapped due to the scale
or purpose of the GDB. Deletion of noise or artifact, or completion and
interpretation of real objects, will require all knowledge sources available in the
preceding steps 1, 3, and 4. In particular, an interactive human user may be
called into the decision-making. There may also be features which are mapped but which
have no corresponding curve data in the image. For these cases, a top-down
verification procedure should be called to gather detailed evidence about the
object's presence. Failure to detect the mapped object should either alert the
human consultant or result in special symbolization on a preliminary cartographic
product. Finally, certain focused searches should be performed in order to
check for new objects which were not previously mapped and which yielded no edge
activity in the preprocessed image. For example, drainage networks could be
scanned for new bridges, and road networks could be scanned for new connections or
for vehicle activity.


8.  Outstanding  problems  for  ACES 


Even  a modest  ACES  system  will  be  a complex  system  built  on  the  most 
recent  accomplishments  in  computer  science,  cartography,  and  electronics. 

Many  installations  are  currently  implementing,  or  have  implemented,  systems 
which  are  designed  to  register  Landsat  imagery  to  maps  for  interactive  map 
update.  Problems  whose  current  solutions  are  regarded  as  good  enough  to  support 
such  systems  are  as  follows. 

- Radiometric correction
- Geometric correction
- Registration
- Regridding

Current systems still require the human for feature extraction and for making map
revision decisions. Problems remain for the automatic extraction of features
and the management and use of real-world knowledge necessary in that process.
Certain aspects of these problems are discussed below.

8.1  Extraction  of  primitive  image  features 

In the opinion of the author, no research has demonstrated that satisfactory
low-level primitive extraction can be accomplished automatically on a varied set
of imagery. What is needed is a reliable procedure for segmenting imagery into
primitive regions or boundaries. It appears unlikely that accurate and detailed
segmentations can be obtained without the direction of knowledge at a very high
level. Accepting this conclusion, one must then hope that enough accurate
detail can be extracted automatically so that sufficient context is available
to efficiently evoke the correct knowledge to interpret the remaining weak
detail. This implies a combined data-directed and knowledge-directed procedure
for primitive extraction. Control of such a procedure is a problem of much current
interest in A.I.

8.2  Encoding  and  using  knowledge 

Several  sections  of  this  report  have  discussed  the  use  and  encoding  of 
knowledge.  Many  obvious  examples  have  been  given  where  a priori  knowledge  was 
necessary  and  sufficient  to  interpret  imagery.  Certain  of  these  applications  of 
knowledge  were  even  easy  to  program  for  automatic  decision  making.  However, 
there  is  yet  no  knowledge  encoding  and  manipulation  paradigm  that  can  implement 
a rich  set  of  geographic  information.  Perhaps  the  most  useful  device  currently 
available  is  to  use  a priori  positional  (iconic)  knowledge  stored  in  a geographic 
data  base.  This  device  has  not  received  development  in  proportion  to  its  potential, 
partly because of the difficult access to GDBs in the A.I. community and partly
because  of  the  research  community's  infatuation  with  higher  level  knowledge  sources. 

8.3  Detection  and  treatment  of  ambiguous  situations 

The research community has not yet learned to handle ambiguous situations
with automatic programs. This is a fundamental issue. The problem may also be
viewed as the problem of switching to higher levels of knowledge application
to make decisions. Heterarchical approaches, which change levels whenever the
context arises, have been tried with astonishing success in small domains [Winograd
1971], but the approach seems to be unmanageable in rich domains. Hierarchical
relaxation is being tried [Hayes 1977]. The idea is to preserve all possible
ambiguous interpretations at analysis level i and pass them on to higher levels i+j
for refinement. The method will apparently suffer from an explosion of possibilities
which level i is too uninformed to dismiss, but results are not yet in.

Concrete  examples  should  be  considered  before  continuing.  In  region  growing, 
the  decision  must  be  made  to  merge  or  not  merge  a given  set  of  pixels  with  a neigh- 
boring set.  The  two  sets  of  pixels  have  different  properties  but  they  are  more 
similar  in  properties  than  any  other  pair  of  sets.  By  switching  to  a higher  level 
where  it  might  be  known  to  which  object(s)  the  two  sets  of  pixels  belong,  the 
merging  decision  is  easily  made  - - merge  only  if  both  sets  are  from  the  same  object. 
Under a relaxation approach merging could not safely be done, because the merge
would be irrevocable at higher levels and might be incorrect. A lineal feature
connecting to and disappearing in a region feature might be a road disappearing in
a forest  canopy,  a road  terminating  at  an  open  water  boundary,  a stream  dumping 
into  open  water,  or  a stream  disappearing  under  a forest  canopy.  Resolution  of 
the  situation  can  be  handled  at  a level  higher  than  the  lineal  tracker,  especially 
by  relaxation  labeling  as  done  in  Section  6.4.1,  but  the  information  might  be 
critical  for  continued  performance  of  the  tracker.  Should  the  tracking  continue 
into  the  region  to  which  the  lineal  has  connected? 
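
The region-growing example can be sketched in a few lines of Python. This is an
illustrative sketch only, not a procedure from this report: the function names,
the use of mean gray level as the similarity measure, and the object_of table
standing in for higher level knowledge are all assumptions.

```python
from statistics import mean


def most_similar_pair(regions, adjacent):
    """Return the adjacent pair of regions whose mean gray levels are closest.

    regions  -- dict: region id -> list of pixel gray levels (illustrative)
    adjacent -- non-empty set of frozensets, each holding two neighboring ids
    """
    best = None
    for pair in adjacent:
        a, b = sorted(pair)
        difference = abs(mean(regions[a]) - mean(regions[b]))
        if best is None or (difference, a, b) < best:
            best = (difference, a, b)
    return best[1], best[2]


def should_merge(a, b, object_of=None):
    """Decide the merge using (hypothetical) higher level knowledge.

    object_of maps region id -> object label. Returns True or False when both
    labels are known, and None when the low level cannot decide by itself.
    """
    if object_of is None or a not in object_of or b not in object_of:
        return None  # ambiguous: defer the decision to a higher level
    return object_of[a] == object_of[b]
```

When no object labels are available the sketch returns None rather than
guessing, which corresponds to deferring the decision to a higher level.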

8.4 Control of ACES interaction with a human interpreter

Although  ACES  would  ideally  be  totally  automatic,  this  goal  does  not  seem 
warranted  for  the  near  future  and  human  guidance  must  be  used  when  appropriate. 

But how should the interaction be controlled? The human is neither desirous
of nor capable of communicating with computer algorithms in terms of computer
problem representations, such as data structures, state descriptions, prenex
normal forms, and probabilities. Humans are masters of the linguistic and visual
domains. Fortunately the task is in the visual domain and good hardware devices
are available for graphics. The linguistic domain, unfortunately, is not as well
understood as many researchers would have us believe, again because of the
problem of a priori knowledge. Humans can efficiently supply global chunks of
knowledge to the computer process, but not local chunks. Joining the man and
machine is likely to require a solution to the context switching problem
discussed in Section 8.3.

8.5  Change  detection 

In comparing raw image data to archived base maps, a comparison is made
between unstructured real data and highly structured symbolic data. The
equivalence class of raw images having the same symbolic map is very large due to
nuances  such  as  lighting  differences,  season  change,  or  sensor  change,  or  due 
to  the  objectives  of  interpretation.  Do  we  map  a puddle  on  the  village  green, 
or  a ship  passing  through  a canal?  Ultimately,  change  detection  must  evaluate 
significance  and  hence  must  be  implemented  using  high  levels  of  knowledge. 

8.6 Multidate functioning


Use has been made of multidate Landsat imagery for land classification,
especially for vegetation classification [Kalensky 1974]. We should
assume that registration procedures are currently good enough to produce useful
multidate imagery, although there will be confused areas near region boundaries.
Road  tracking  may  be  done  better  in  winter  imagery  while  vegetation  regions 
might  better  be  gotten  from  summer  imagery.  Crop  classification  requires  imagery 
from  several  dates.  Management  of  multidate  image  data  and  use  of  it,  together 
with  temporal  a priori  knowledge,  in  mapping  decisions  presents  another  problem 
for  ACES  control  and  knowledge  base. 

9. Items for future work/research


The  preceding  sections  of  this  report  have  roughed  out  paradigms 
for  automation  of  image  analysis  in  the  production  of  cartographic  products. 

The principal conclusion reached from the assessment of pattern recognition
and artificial intelligence techniques is that useful automation will not be
obtained unless large amounts of a priori real world knowledge are available
to image analysis procedures. It is also evident that only a small fraction
of the knowledge available to a human interpreter may be available to an automatic
process, largely because of limitations in encoding knowledge, in combining
knowledge (in supporting or contradictory manners), or in accessing the
knowledge or the data on which it is to operate. Knowledge engineering in A. I.
has progressed only to the point of establishing expert systems in very
limited domains. Examples of such systems are MYCIN [Shortliffe 1976],
PROSPECTOR [Duda 1977], and HEARSAY [Reddy et al 1973].

Automatic image analysis necessary to fully support cartographic
compilation would require far more sophisticated knowledge than has previously
been embodied in any existing expert system. However, it should be possible
to perform several compilation tasks using current techniques. The most
promising approach appears to be that of encoding in the knowledge base knowledge of
only a positional nature and forgoing the storage of other types of knowledge.
Other types of knowledge may reappear as procedural knowledge in specific
processing tasks.

Encoding positional type knowledge allows straightforward comparison
of new data with knowledge stored in the data base according to geographic
position. This should allow simpler programs with acceptable computational
complexity. It might be further argued that the "neighborhood of applicability"
of positional knowledge should be small so that array or raster processing is
possible in applying the knowledge. In addition to simplicity there is a stronger
reason for applying positional knowledge, and that is that we already have huge
amounts of positional knowledge encoded in our current cartographic data bases.
Human digitization of cartographic features could then be viewed as a
bootstrapping process for further automated analysis. Items of future work directed
toward completion of the steps necessary for map-guided interpretation are
discussed below.
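
A raster application of positional knowledge with a small neighborhood of
applicability might look like the following sketch. The label rasters, the
function name disagreement_mask, and the radius parameter are hypothetical and
are chosen only for illustration; both rasters are assumed to be already
registered and of equal size.

```python
def disagreement_mask(gdb, observed, radius=1):
    """Flag cells whose observed label does not occur in the GDB nearby.

    gdb, observed -- equal-sized 2-D lists of class labels (illustrative)
    radius        -- half-width of the neighborhood of applicability, in cells
    """
    rows, cols = len(gdb), len(gdb[0])
    mask = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Labels the GDB allows within the small window around (r, c).
            allowed = {
                gdb[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            }
            row.append(observed[r][c] not in allowed)
        mask.append(row)
    return mask
```

With radius set to 0 the comparison is strictly cell by cell; a radius of 1
tolerates a one-cell boundary displacement, so only labels with no nearby
support in the GDB are flagged.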

9.1  Registration  of  images  to  base  maps 

To unlock positional knowledge stored in a map archive, new source imagery
must be registered to the symbolic representation in the archive. It is always
known from navigational technology approximately what area of the earth is imaged
and thus appropriate areas of archived symbology can be accessed. Local
adjustments will be necessary to rotate, translate, stretch, or deform the image data
so that points with global archive coordinates can be located in terms of image
coordinates, or vice versa. These adjustments must be made by recognizing the
correspondence between key (control) map and image features and defining the
registration transformation from the known corresponding features. Unique point
features are often chosen for control and block correlation is used to
automatically detect the correspondence between map and image points. Unique edge
features are also useful for registration. Arnold [1977] has asserted that
point features are typically better for natural scenes while edge features are
usually better for scenes with man-made structure. The greatest deficiency
of current automatic registration procedures is that feature correspondences
are decided upon one pair at a time. Correspondence errors are frequently made
and contribute to inaccurate registration transformations. L.N.K. has recently
developed a registration concept using clustering evidence based on possible
pairwise correspondences [Stockman 1978]. The transformation that explains the
largest set of feature pairs is chosen. A fair proportion of incorrect
correspondences can be tolerated and will have no effect on the resulting
transformation. Although the technique has proven successful in tests on 3
different data sets, more development and testing are required. First, it must
be tested on natural terrain. Second, the technique must be adjusted to handle
registration transformations other than RS and T.
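
The clustering idea can be illustrated with a simple voting scheme. The sketch
below is not the procedure of [Stockman 1978]; it is a minimal stand-in that
assumes distinct point features, a transformation limited to rotation, scaling,
and translation, and hypothetical bin sizes. Points are held as complex numbers
so that the transformation is z' = a*z + b.

```python
from collections import Counter
from itertools import permutations


def estimate_similarity(map_pts, img_pts, a_step=0.05, b_step=1.0):
    """Vote for the rotation-scale-translation that explains the most pairs.

    map_pts, img_pts -- lists of distinct complex numbers (x + y*1j)
    a_step, b_step   -- assumed bin sizes for the parameters a and b
    Returns (a, b, votes) for the fullest bin.
    """
    votes = Counter()
    # Every pair of map points matched to every pair of image points
    # proposes exactly one transformation z' = a*z + b.
    for m1, m2 in permutations(map_pts, 2):
        for p1, p2 in permutations(img_pts, 2):
            a = (p2 - p1) / (m2 - m1)
            b = p1 - a * m1
            key = (round(a.real / a_step), round(a.imag / a_step),
                   round(b.real / b_step), round(b.imag / b_step))
            votes[key] += 1
    key, count = votes.most_common(1)[0]
    a = complex(key[0] * a_step, key[1] * a_step)
    b = complex(key[2] * b_step, key[3] * b_step)
    return a, b, count
```

Incorrect pairings scatter their votes over many bins, while correct pairings
accumulate in one bin, which is why a fair proportion of bad correspondences
can be tolerated. The exhaustive enumeration is practical only for small
feature sets.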

9.2  Detection  of  lineal  cartographic  features  in  source  imagery 

If registration as described in Section 9.1 is to be accomplished, certain
key features must be acquired in the absence of the archived data. In particular,
curved edge segments representing lineal features (roads, streams, etc.) or the
boundaries of regions (land-water, forest-field, etc.) must be obtained in a totally
automatic fashion. For scenes with much man-made structure it is easy to get a
sufficient set of edge features, but there may be problems with natural
terrain. More work needs to be done with natural terrain and point features
to ensure that registration can be successfully accomplished.

Techniques must be developed for tracking the paths of archived curves
in the source imagery. The registration transformation allows points of the
GDB to be positioned in the imagery, but perfect feature correspondence may not
be achieved pointwise because of 1) local distortion (i.e. no real change),
2) insignificant change in the feature (e.g. land-water boundary change due
to the tide), and 3) significant change in the feature (e.g. the widening of
a road). Many tracking techniques exist but more research is required in
order to interpret the differences between archive and image tracks which will
frequently occur.
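
One simple way to follow an archived curve in the imagery is to search a small
window around each projected GDB point for the strongest edge response. The
sketch below assumes an edge-strength array has already been computed; the
function name snap_curve, the window radius, and the strength threshold are
hypothetical. It only measures the pointwise differences and does not attempt
to interpret them.

```python
def snap_curve(edge, curve, radius=2, min_strength=0.5):
    """Match each projected archive point to the best nearby edge pixel.

    edge  -- 2-D list of edge strengths (illustrative input)
    curve -- list of (row, col) archive points already mapped into the image
    Returns a list of (archive_point, image_point or None).
    """
    rows, cols = len(edge), len(edge[0])
    matches = []
    for r, c in curve:
        best = None  # (strength, -squared distance, row, col)
        for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
            for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                strength = edge[rr][cc]
                if strength < min_strength:
                    continue
                # Prefer stronger edges; break ties by nearness to (r, c).
                candidate = (strength, -((rr - r) ** 2 + (cc - c) ** 2), rr, cc)
                if best is None or candidate > best:
                    best = candidate
        matches.append(((r, c), None if best is None else (best[2], best[3])))
    return matches
```

Points that return None have no edge support in the window; whether that
reflects distortion, insignificant change, or significant change is the
interpretation problem left open above.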

New  lineal  features  in  the  source  imagery  must  also  be  extracted.  One 
assumption  that  can  be  made  from  a priori  knowledge  is  that  new  lineals  must 
connect  to  an  existing  network.  Techniques  for  tracking  archived  lineal 
features  and  detecting  new  connections  should  be  developed. 

9.3  Analysis  of  regions  in  source  imagery 

Region  content  of  source  imagery  must  be  checked  for  significant  change 
with  respect  to  the  GDB.  Changes  may  consist  of  changes  to  the  boundary  or  to 
the  interior  of  the  region.  Changes  to  the  boundary  of  regions  may  be  detected 
by  the  same  approach  used  for  lineals  in  Section  9.2.  Another  approach  based 
on  tonal  or  textural  pixel  features  and  useful  for  checking  the  interior  of  a 
mapped region for change is as follows. Let R_M be a (discrete) set of points
specifying the interior of region R in the GDB. Let R_I be the corresponding
set of image points of R obtained via the registration transformation. Histograms
of the features of the set of pixels R_I provide information on the structure of
the  region.  A major  peak  should  exist  in  each  histogram  indicating  region  pixels 
while  minor  peaks  should  exist  for  non-region  pixels.  Large  minor  peaks  could 
be  due  to  a change  in  the  region  boundary  or  to  the  introduction  of  new  objects 
in the interior. The structure of the peaks should be most significant and
should be independent of uniform change such as a sensor calibration shift,
lighting condition, or even seasonal change. In certain cases absolute or relative
peak  displacement  could  be  interpreted.  In  any  case  the  region  feature  is  mapped, 
and  hence  known,  so  that  feature  specific  interpretation  is  possible.  Consider 
a mapped  lake  with  unmapped  islands  appearing  in  the  source  imagery.  The  tone 
of  the  island  would  differ  from  that  of  the  water  and  two  distinct  peaks  would 
be visible in the histogram. As a second example consider the browning of a
deciduous forest or crop in the fall. A definite histogram peak shift would be
observed in the green band of MSS data. For resource monitoring this may be
interpreted as significant change, but for cartographic purposes the similar peak
structure is indicative of no significant change. Map-guided region analysis as
described here is another topic for future work.
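
The histogram test can be sketched as follows. This is a crude illustration
under stated assumptions: a single tonal feature per pixel, a fixed bin width,
and a peak counted only if it holds at least a given fraction of the region's
pixels. The names histogram, peaks, and looks_changed are hypothetical.

```python
from collections import Counter


def histogram(values, bin_width=8):
    """Count pixel values of the region in bins of the given width."""
    return Counter(v // bin_width for v in values)


def peaks(hist, min_fraction=0.05):
    """Return (bin, count) for local maxima holding enough of the pixels."""
    total = sum(hist.values())
    found = []
    for b, n in sorted(hist.items()):
        is_local_max = n > hist.get(b - 1, 0) and n >= hist.get(b + 1, 0)
        if is_local_max and n >= min_fraction * total:
            found.append((b, n))
    return found


def looks_changed(values, expected_peaks=1):
    """Report change when the peak structure differs from the expected one.

    Only the number of peaks is compared, so a uniform shift of all values
    (calibration, lighting, season) does not by itself signal change.
    """
    return len(peaks(histogram(values))) != expected_peaks
```

For the mapped lake, pixels from an unmapped island would add a second peak
and be reported, while a uniform tonal shift of the water would leave a single
peak and go unreported.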

10. Conclusions


The key problem in automatic cartographic feature extraction lies in the
use of knowledge necessary for the extraction process. Knowledge must be
applied  in  various  forms  and  at  various  levels  of  interpretation.  The  use  of 
knowledge  in  a rich  domain  is  the  central  issue  in  A. I.  today.  Achievements 
have  only  been  impressive  where  limited  domains  were  considered  [Winograd  1973, 
Duda  1977,  Feigenbaum  1977].  It  is  therefore  presumptuous  to  expect  at  this 
time  a paradigm  to  explain  the  complete  interpretation  of  source  imagery,  and 
no  such  unified  paradigm  was  developed  in  this  report. 

It is possible to consider paradigms which address part of the
interpretation process. That is, there are candidate partial theories which either
operate at certain levels of the interpretation process, or which operate at
multiple levels but implement only a fraction of the available knowledge. The
paradigm of map-guided interpretation seems to be the most promising at the
current time. This paradigm depends upon the registration of source imagery to the
positional knowledge stored in a geographic data base (GDB). Useful GDB's
currently exist and application of positional knowledge in a computer is
straightforward. Source imagery from an area without a base map could not be handled
under this partial paradigm and would have to be handled by current manual
techniques. But, by such "bootstrapping" of knowledge, the system could be automatic
on repeat coverage.

Since humans can readily apply general knowledge in analyzing scenes
never  before  viewed,  paradigms  other  than  map-guided  interpretation  are  of 
great  interest.  In  particular,  bottom-up  feature  recognition  seems  to  be 
necessary.  (It  is  also  necessary  for  registration  in  the  map-guided  approach.) 

The ability of humans to interpret black and white photographs is particularly
amazing in light of current difficulties with automatic interpretation. A. I.
is currently hung up on the problem of scene segmentation and it is the
conclusion of this study that very high level knowledge is necessary in order to
segment black and white (B&W) imagery. This is because global spatial
relationships and causal relationships not evident in the data itself are necessary in
order to determine the function or content of objects in the scene.

The  application  of  high  levels  of  knowledge  in  the  bottom-up  interpretation 
of  black  and  white  imagery  should  be  vigorously  pursued.  However,  for  near-future 
automation  of  cartographic  feature  extraction  it  is  recommended  that  attention 
should  be  paid  to  MSS  data.  Use  of  MSS  data  permits  the  process  of  symbolization 
(interpretation)  to  occur  at  a very  low  level  in  the  processing.  This  can  reduce 
the  volume  of  data  and  the  amount  of  ambiguity  passed  on  to  higher  level  analyses. 
Current deterrents to MSS use, i.e. registration, resolution, cost and tradition,
appear to be practical, not theoretical, considerations.

Under the top-down, map-guided paradigm both B&W and MSS imagery can
be handled, although feature tracking and verification would be more difficult
(theoretically) in the B&W case. Sections 6 and 7 sketched an ACES system
structure that would perform map-guided interpretation. For practical reasons a human
consultant is also in the process. Further research and development are necessary
to test and perfect the proposed concepts in map-guided interpretation. There
are interesting problems for future work, and hopefully large gains to be made in
automated map compilation.